
Web-Scale Information Extraction in KnowItAll (Preliminary Results)






















Manually querying search engines in order to accumulate a large
body of factual information is a tedious, error-prone process of
piecemeal search. Search engines retrieve and rank potentially
relevant documents for human perusal, but do not extract facts, assess
confidence, or fuse information from multiple documents. This paper
introduces KnowItAll, a system that aims to automate the
tedious process of extracting large collections of facts from the web
in an autonomous, domain-independent, and scalable manner.



The paper describes preliminary experiments in which an
instance of KnowItAll, running for four days on a single machine,
was able to automatically extract 54,753 facts. KnowItAll
associates a probability with each fact, enabling it to trade off precision
and recall. The paper analyzes KnowItAll's architecture and
reports on lessons learned for the design of large-scale information
extraction systems.
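
To illustrate how a per-fact probability allows precision to be traded against recall, here is a minimal sketch in Python. It is not KnowItAll's implementation: the `Fact` type, the `precision_recall` helper, the example facts, their probabilities, and the correctness labels are all invented for illustration. The idea is only that raising the acceptance threshold keeps fewer, more reliable facts, while lowering it keeps more facts at the cost of more errors.

```python
from typing import List, NamedTuple, Tuple


class Fact(NamedTuple):
    """A hypothetical extracted fact (not KnowItAll's actual data model)."""
    text: str
    probability: float  # confidence assigned by the extractor
    correct: bool       # made-up gold label, used only to score the sketch


def precision_recall(facts: List[Fact], threshold: float) -> Tuple[float, float]:
    """Accept facts with probability >= threshold and score the accepted set."""
    accepted = [f for f in facts if f.probability >= threshold]
    total_correct = sum(f.correct for f in facts)
    true_positives = sum(f.correct for f in accepted)
    # Convention assumed here: an empty accepted set has precision 1.0.
    precision = true_positives / len(accepted) if accepted else 1.0
    recall = true_positives / total_correct if total_correct else 0.0
    return precision, recall


# Invented examples; none of these come from the paper.
facts = [
    Fact("City(Paris)", 0.95, True),
    Fact("City(Tokyo)", 0.90, True),
    Fact("City(Garden)", 0.60, False),
    Fact("City(Lima)", 0.55, True),
    Fact("City(Hotel)", 0.30, False),
]

for threshold in (0.8, 0.5, 0.0):
    p, r = precision_recall(facts, threshold)
    print(f"threshold={threshold:.1f} precision={p:.2f} recall={r:.2f}")
```

On this toy data, a threshold of 0.8 keeps only the two highest-confidence facts (perfect precision, two thirds recall), while a threshold of 0.5 recovers every correct fact but also admits one incorrect one.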


