
Open Information Extraction from the Web










Michele Banko, Michael J Cafarella, Stephen Soderland, Matt Broadhead and Oren Etzioni



Turing Center

Department of Computer Science and Engineering
University of Washington











Traditionally, Information Extraction (IE) has focused
on satisfying precise, narrow, pre-specified
requests from small homogeneous corpora (e.g.,
extract the location and time of seminars from a
set of announcements). Shifting to a new domain
requires the user to name the target relations and
to manually create new extraction rules or hand-tag
new training examples. This manual labor scales
linearly with the number of target relations.



This paper introduces Open IE (OIE), a new extraction
paradigm where the system makes a single data-driven pass
over its corpus and extracts a large set of relational
tuples without requiring any human input. The paper also
introduces TextRunner, a fully implemented, highly scalable
OIE system where the tuples are assigned a probability and
indexed to support efficient extraction and exploration
via user queries.
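
To make the idea of probability-scored, indexed tuples concrete, here is a minimal sketch in Python. It is not TextRunner's implementation: the `Extraction` and `TupleIndex` names, the token-level inverted index, and the sample tuples with their probabilities are all hypothetical, chosen only to illustrate how scored tuples could be stored and queried.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Extraction:
    """A relational tuple (arg1, relation, arg2) with a confidence score."""
    arg1: str
    relation: str
    arg2: str
    probability: float


class TupleIndex:
    """Toy inverted index (an assumption, not the paper's design):
    maps each lowercased token to the tuples that contain it."""

    def __init__(self):
        self._postings = defaultdict(set)

    def add(self, extraction):
        text = f"{extraction.arg1} {extraction.relation} {extraction.arg2}"
        for token in text.lower().split():
            self._postings[token].add(extraction)

    def query(self, *terms, min_probability=0.0):
        """Return tuples containing every term, highest probability first."""
        if not terms:
            return []
        hits = set.intersection(
            *(self._postings.get(t.lower(), set()) for t in terms)
        )
        kept = [e for e in hits if e.probability >= min_probability]
        return sorted(kept, key=lambda e: e.probability, reverse=True)


# Made-up example tuples and scores, for illustration only.
index = TupleIndex()
index.add(Extraction("Edison", "invented", "the phonograph", 0.93))
index.add(Extraction("Edison", "was born in", "Ohio", 0.81))
index.add(Extraction("Bell", "invented", "the telephone", 0.40))

results = index.query("invented", min_probability=0.5)
for e in results:
    print(e.arg1, "|", e.relation, "|", e.arg2, "|", e.probability)
```

Filtering on `min_probability` at query time, rather than at insertion time, keeps low-confidence tuples available for exploratory queries while letting stricter queries ignore them.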



We report on experiments over a 9,000,000 Web page corpus
that compare TextRunner with KnowItAll, a state-of-the-art
Web IE system. TextRunner achieves an error reduction of 33%
on a comparable set of extractions. Furthermore, in the
amount of time it takes KnowItAll to perform extraction for
a handful of pre-specified relations, TextRunner extracts a
far broader set of facts reflecting orders of magnitude more
relations, discovered on the fly. We report statistics on
TextRunner’s 11,000,000 highest probability tuples, and show
that they contain over 1,000,000 concrete facts and over
6,500,000 more abstract assertions.
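
The abstract does not spell out how the error reduction is computed. Assuming the usual relative definition, with $e_{\text{base}}$ the baseline system's error rate and $e_{\text{new}}$ the new system's error rate on the same set of extractions, it can be written as below. The numbers in the second line are purely illustrative and are not taken from the abstract.

```latex
\[
  \text{error reduction}
    = \frac{e_{\text{base}} - e_{\text{new}}}{e_{\text{base}}}
\]
\[
  \text{e.g.}\quad
  \frac{0.30 - 0.20}{0.30} = \frac{1}{3} \approx 33\%
\]
```

Under this reading, a 33% error reduction means the new system makes about two thirds as many errors as the baseline, not that its error rate is 33 percentage points lower.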


