Re: [問題] Google Interview Question (4)

作者: Leon (Achilles)   2013-03-09 15:15:33
※ 引述《stimim (qqaa)》之銘言:
: 剛剛和 Leon 討論了一下,發現我們對題目的了解有一點不同
: 原題目:
: Given a document and a query of K words, how do u find the smallest window
: that covers all the words at least once in that document? (given you know the
: inverted lists of all K words, that is, for each word, you have a list of all
: its occurrrences). This one is really hard. Could someone propose an
: algorithm in O(n)?
: 我們可以確定:
: 1. document is given
: 2. 知道我們關心的是哪 K 個字
: 3. 這 K 個字的 occurrence list 也是給定的
: 問題是,n 指的是什麼?
: 1. n = document size (in number of words)
: 在這種情況下, O(n) 是可能的,方法基本上就是 Leon 提出來的。
恕刪.
我覺得那個 N 的定義比較有可能是 document length..
不過下面那個定義, 我想我的方法應該也可以啦.
Here is the modification.
: 2. n = 這 K 個字出現的總次數
: Ex:
: K = 3, 要找出包含 {a, b, c} 的最小 window
: occurrences:
: a = { 1, 1000, 2500}
: b = {400, 1001, 2000}
: c = {500, 1002, 1500}
: document size = 2500 甚至更高 (其他的位置是其他的字)
: n = 3 + 3 + 3 = 9 而不是 2500
: 在這種情況下,已知可以做到 O(n logK) (RockLee 寫的方法)
: 不過,在這種狀況下,給定原本的 document 似乎就沒什麼用了
: 而且在讀入 input 時,
: 讀入原本的 document 會花掉 O(document size) 的時間
: 所以在和 Leon 討論時,他對這種解釋方法有一點懷疑
So, when you build the occurance list from document,
actually you will have scaned this array first..
{1, 400, 500, 1000, .... 2500} which represents {a, b, c, a..}
We need to keep this too. Call it "Show_list"
Then, use my algorithm, the left boundary are choosen sequentally
from "Show_list".
The first step will be, choosing [1] as left,
and use max-min{b, c} as right.
The second step is, drop [1] and use "Show_list(2)" as a new
left boudary.
Now, you drop at most one element in the window.
The need O(1) to find the right boundary and get back this element.
Hope it helps.

Links booklink

Contact Us: admin [ a t ] ucptt.com