by J.M. Vanel, Copyright © J.M. Vanel - 2001-2002
First you need a Java runtime (JRE) or JDK; I use a 1.4.0 JDK; it probably still works with 1.3. See below for resource URLs.
We strive to make installation easier; our goal is "no installation".
We hope that an Ant build.xml file will be able to fetch everything.
But for now, this is it:
Set the EDITOR environment variable to your programmer's editor suitable for XML files (it must be a DTD-less editor for now, since we haven't written a DTD!); on Linux I use gvim; if you have none on Windows, just use notepad or write.
In the distribution directory, type: ant
Below you will find examples of environment variable settings for different operating systems.
In addition, for desktop publishing connectors you will need one or more of the following (optional, not needed if your source documents are regular XHTML):
Spreadsheet::ParseExcel and its prerequisites; get them from the Comprehensive Perl Archive Network (CPAN); download and install in this order:
IO-stringy-1.220
OLE-Storage_Lite-0.08
Spreadsheet::ParseExcel
Copy empty_project/ into a new directory (no space in the directory name), say myProject/.
Put into myProject/ any number of source documents with the following suffixes:
.table.htm or .table.html for table structure
.parag.htm or .parag.html for paragraphs and sub-paragraphs delimited by h1 and h2 titles
.xls for MS Excel files
Type "ant" in a shell (Unix, Cygwin or DOS) in the myProject/ directory.
Edit the USER_SETTINGS/thesaurus.xml file in the window that appears; rest assured, it works unchanged.
Look at the result in html/index.html.
Customize the files in the USER_SETTINGS/ directory (see "Customization" below).
Copy the html directory to your Web site.
To start afresh, type "ant clean" in a shell, or just "ant clean-html".
In any situation, you can directly type ant. XMLPublication will start an editor for USER_SETTINGS/thesaurus.xml if it sees that it might not be up-to-date. See below for sample files.
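The first steps above can be sketched as a small shell helper. This is only an illustration: new_project is a hypothetical name, not something shipped with XMLPublication, and the paths in the sample session are assumptions.

```shell
# Sketch only: create a new XMLPublication project from the template.
# new_project is a hypothetical helper, not part of the distribution.
new_project() {
    template=$1    # path to the distribution's empty_project directory
    target=$2      # new project directory, e.g. myProject
    # The guide asks for a directory name without spaces.
    case $target in
        *" "*) echo "error: no space allowed in directory name: $target" >&2
               return 1 ;;
    esac
    # Refuse to overwrite an existing project.
    if [ -e "$target" ]; then
        echo "error: $target already exists" >&2
        return 1
    fi
    cp -R "$template" "$target"
}

# Typical session (paths and file names are examples):
#   new_project /path/to/XMLPublication/empty_project myProject
#   cp mydoc.parag.htm prices.table.html myProject/
#   cd myProject && ant
```

The helper stops before calling ant, so that source documents can be dropped into the new directory first.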
Customization is done in the USER_SETTINGS/ subdirectory.
thesaurus.xml is generated by XMLPublication and contains all the words in each rubric. Since this is certainly too much for most documents, you will want to remove some or many keywords by editing thesaurus.xml. Although this is the normal way to proceed, you can also reuse index entries from books or other documents in the same domain of knowledge.
After any change, just type ant in a shell to remake the Web pages.
:
<rubric use-keywords='no' ...
<w> elements
To help you edit this file, you can consult the USER_SETTINGS/publication.xml file, which is an exact image of the index entries present in all the document source file(s). The first time the project is processed, thesaurus.xml is created from publication.xml. Afterwards, thesaurus.xml is never modified by the framework, but publication.xml is always updated with respect to the document sources.
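The life cycle of the two files can be pictured with the following sketch. It is not the framework's code: sync_thesaurus is a hypothetical name, and a plain copy stands in for whatever transformation XMLPublication really applies when it creates thesaurus.xml.

```shell
# Sketch only: mimics the documented life cycle of the two files.
# sync_thesaurus is a hypothetical name; the plain copy is an assumption.
sync_thesaurus() {
    settings=$1    # path to the USER_SETTINGS directory
    if [ ! -f "$settings/thesaurus.xml" ]; then
        # First run: thesaurus.xml is created from publication.xml.
        cp "$settings/publication.xml" "$settings/thesaurus.xml"
        echo created
    else
        # Later runs: the user's edits are left alone.
        echo kept
    fi
}
```

The point of the design is that your edits to thesaurus.xml survive later runs, while publication.xml remains available as a fresh reference.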
Adapt the USER_SETTINGS/item_file_name.xslt file. For a first trial you can consult the kernel/item_file_name-generic.xslt file, which works with any source document, although the item_name, which will be displayed in the HTML pages, is not satisfactory for all documents.
Then type "ant"; only the necessary files will be updated.
The generated pages carry class attributes. This allows CSS styles to be associated with specific elements of the Web pages. This is the standard way to specify colors, fonts, and styles in Web pages, and all HTML editors support it. In XMLPublication the best way is to define styles in the site-specific XHTML wrapper (presentation.html).
To merge items, add <merge-items according-to-label="my-rubric-label" group-by="rubric|source" /> and type "ant all-merge" in the shell.
Instead of the default, which is to create one HTML page per item, this will generate one HTML page for each different value of the specified rubric, thus merging items from different document sources.
If your source documents are regular XHTML (like those produced by Amaya), they must have suffix .xhtml
or .html . in the work/
directory of your XMLPublication project.
Files ending with .html and .xhtml are considered OK and are simply copied into the $WORK directory.
Files ending with .htm are considered non-XHTML files and a corresponding .html is generated in $WORK directory; the tool does an automatic tidy for MS Word and other editors producing non-XHTML files. So if you have an MS Word file file.htm containing paragraphs and sub-paragraphs, rename it file.parag.htm before calling ant .
Currently there is no "expert system" to guess the kind of HTML! It would be difficult to develop such an "expert system" because you can have <table> inside paragraphs or the reverse. So we rely on the file suffixes of source documents.
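The suffix conventions described in this guide can be restated as a small lookup. This is a sketch for illustration only: classify_source and the labels it prints (table, parag, excel, plain, tidy, copy) are hypothetical and are not taken from the XMLPublication build file.

```shell
# Sketch only: restates the suffix conventions as a lookup.
# classify_source is a hypothetical helper, not a function of XMLPublication.
classify_source() {
    name=$1
    # Initial structure of the document, taken from the suffix.
    case $name in
        *.table.htm|*.table.html) structure=table ;;
        *.parag.htm|*.parag.html) structure=parag ;;
        *.xls)                    structure=excel ;;
        *.htm|*.html|*.xhtml)     structure=plain ;;
        *)                        structure=unknown ;;
    esac
    # .htm files are tidied into .html; .html and .xhtml are copied as is.
    case $name in
        *.htm)          handling=tidy ;;
        *.html|*.xhtml) handling=copy ;;
        *)              handling=other ;;
    esac
    echo "$structure $handling"
}

# classify_source report.parag.htm    prints "parag tidy"
# classify_source prices.table.html   prints "table copy"
# classify_source data.xls            prints "excel other"
```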
The sourceDocs/ directory in the distribution contains various source documents that you can drop into a copy of the empty_project/ directory. Any number of source documents can be added, with or without common rubrics, with an initial table or paragraph structure, and XMLPublication will generate consistent index pages by rubric.
On rare occasions, processing hangs at the tidy stage; just kill -15 the process or kill the window, then run ant clean.
If you type ant -logfile ant.log, the messages of XMLPublication will be kept in that file. You can add -verbose to that to get extra information about what ant does.
If you type ant -projecthelp, you will see the intermediate targets (steps) that make up the XMLPublication workflow.
If you type ant schema.xml, you will get a digest of the content of your source documents, combining an XML Schema-like syntax with element count information. This is a 100% generic stylesheet (example2Schema.xslt) that you can reuse for any XML file!
There is also "ant statistics" to generate statistics on the number of items (<h1> tags) and rubrics (<h2> tags).
"ant helpers" launches schema.xml, statistics and words-list-by-rubric.xml.
XML editors: these are good for DTD-less documents:
Recommended (X)HTML editor:
Here is an example of a .bash_profile file for Cygwin for ant:
export ANT_HOME=d:\\ecoropa\\Seedsavers\\jakarta-ant-1.4.1
export JAVA_HOME=c:\\Program\ Files\\JavaSoft\\JRE\\1.3
export PATH=$(cygpath -u "$JAVA_HOME")/bin:\
$PATH:$(cygpath -u "$ANT_HOME")/bin
export EDITOR=notepad
Here is the same example for a regular Unix :
export ANT_HOME=/usr/local/jakarta-ant-1.4.1
export JAVA_HOME=/usr/local/JavaSoft/JRE/1.3
export PATH=$JAVA_HOME/bin:$PATH:$ANT_HOME/bin
export EDITOR=gvim
Here is the same example for a DOS autoexec.bat :
set ANT_HOME=d:\ecoropa\Seedsavers\jakarta-ant-1.4.1
set JAVA_HOME=c:\Program Files\JavaSoft\JRE\1.3
set PATH=%JAVA_HOME%\bin;%PATH%;%ANT_HOME%\bin
set EDITOR=notepad
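For the Unix and Cygwin settings, it can help to check that the three variables are set before typing ant. The following is a minimal sketch; check_env is a hypothetical helper, not part of XMLPublication or Ant.

```shell
# Sketch only: report which of the variables used above are still unset.
# check_env is a hypothetical helper.
check_env() {
    missing=""
    for var in JAVA_HOME ANT_HOME EDITOR; do
        # Indirect lookup of the variable named by $var.
        eval "value=\${$var:-}"
        if [ -z "$value" ]; then
            missing="$missing $var"
        fi
    done
    if [ -n "$missing" ]; then
        echo "not set:$missing"
        return 1
    fi
    echo "environment looks complete"
}
```

Called as check_env && ant, it stops before the build when something is missing.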