State of the WWBKB project
J.M. Vanel - 2003-03-05 (jmvanel@free.fr)
The application, a Flora of China search engine, runs on my personal
machine (only 500 MHz :-( ). The server is up most of the time, but
it's my development machine ...
The initial vision (1999) of the project is here:
http://wwbota.free.fr/call.htm
On the site http://wwbota.free.fr/ you will find complete information
on the project.
The search engine allows queries à la Google on the species
descriptions, and more generally on large XML databases. It can search
for character strings in specific organs or rubrics (e.g. "flower:red
leaf:large"). These simple queries are first translated into XPath
queries, then processed by the eXist XML database, and finally
formatted through Cocoon, an XML publication framework from
Apache.org. It is 100% open source. It is managed by a community of
developers on Sourceforge.net.
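To make the translation step concrete, here is a minimal sketch in Python of how a query such as "flower:red leaf:large" could be turned into an XPath expression. It is not the project's actual code: the function name query_to_xpath, the record element name "species", and the convention that each organ is an XML element named after it are all assumptions made for illustration.

```python
def query_to_xpath(query: str, record: str = "species") -> str:
    """Translate 'organ:word' terms into a single XPath expression.

    Assumptions (hypothetical, not taken from the real WWBKB schema):
    each description is a <species> element, and each organ is marked
    up as an element named after the organ (<flower>, <leaf>, ...).
    Words are inserted as-is, so a word containing a quote character
    would need escaping in real use.
    """
    predicates = []
    for term in query.split():
        organ, sep, word = term.partition(":")
        if sep and organ and word:
            # Restrict the string match to the named organ element.
            predicates.append(f"[.//{organ}[contains(., '{word}')]]")
        else:
            # A bare word is matched anywhere in the description.
            predicates.append(f"[contains(., '{term}')]")
    # Chained predicates mean that every term must match (logical AND).
    return f"//{record}" + "".join(predicates)


print(query_to_xpath("flower:red leaf:large"))
```

Under these assumptions, the resulting expression selects every description that mentions "red" inside a flower element and "large" inside a leaf element; a string of this kind is what would then be handed to the XML database.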
Upstream from that there is a syntactic parser, FloraParse, written in
C++ with Lex & Yacc, which uses the WordNet semantic dictionary from
Princeton University. FloraParse transforms natural-language
descriptions into an XML format where information is marked up as
organ, sub-organ, geography, etc.
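As an illustration of what such marked-up output makes possible, the following Python sketch builds a small description and checks whether a word occurs inside a given organ. The element names (species, leaf, petiole, flower, geography) and the sample text are invented for this example and are not the actual FloraParse output format.

```python
import xml.etree.ElementTree as ET

# Invented sample: a description marked up by organ, sub-organ and geography.
# The real FloraParse element names may differ.
SAMPLE = """
<species name="Example species">
  <leaf>Leaves large, <petiole>petiole short</petiole></leaf>
  <flower>Flowers red</flower>
  <geography>Example region</geography>
</species>
"""

root = ET.fromstring(SAMPLE)


def organ_mentions(description, organ, word):
    """Return True if `word` occurs in the text of any <organ> element,
    including the text of its sub-organs (case-insensitive)."""
    return any(
        word.lower() in "".join(element.itertext()).lower()
        for element in description.iter(organ)
    )


print(organ_mentions(root, "flower", "red"))
print(organ_mentions(root, "leaf", "red"))
```

Because the organ is explicit in the markup, a search for "red" can be limited to flowers instead of matching the word anywhere in the description, which is the point of the "flower:red" query syntax.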
All this is completely operational.
On some points WWBKB is ahead of the efforts of academic taxonomists.
Collaborations include:
- Taxonomic Databases Working Group (http://www.tdwg.org)
- Laboratoire Informatique et Systématique,
Université Pierre et Marie Curie (Paris)
- Harvard University, which provided raw data from the Flora
of China
Jean-Marc Vanel
http://jmvanel.free.fr/
Worldwide Botanical Knowledge Base
http://wwbota.free.fr/