fwwputils 0.5.14 ReadMe.txt
===========================
by Matthew Lewis, 2007-03-31
This code is public domain (except for the install-sh script[1]). I
hope that it will be useful.
However, I make absolutely no guarantees about the suitability of this
code for any purpose whatsoever. Anyone using it does so entirely at
their own risk.
Any fixes, requests or queries can be sent to
matt026 -at- semiprime -dot- com
[1] For use and distribution information relating to the install-sh
script, see the beginning of that file.
Contents
--------
* About fwwputils
* Where to get it
* Status
* Compilation (GNU/Linux)
* Testing (GNU/Linux)
* Installation (GNU/Linux)
* Manual installation (GNU/Linux)
* Windows versions
* Compilation (MS Windows)
* Installation (MS Windows)
* Command-line syntax for converters
* Hacking/Debugging
* Distribution
* If you have any requests
* History
About fwwputils
---------------
fwwputils has two main components: libfwwp and the fwwp2* programs.
libfwwp is a shared library for reading .WP files, which in this
context means the native file format of the word-processor on the
Sharp Font Writer range of machines. FWWP = Font Writer Word-Processor.
The fwwp2* programs are a set of command-line utilities which use
libfwwp to convert .WP files into files suitable for reading on most
desktop computers. Currently the programs are:
* fwwp2txt - convert to ISO 8859-1 (or 7-bit ASCII) text format; all
formatting is lost.
* fwwp2rtf - convert to RTF format (readable by most modern
word-processors)
* fwwp2html - convert to (X)HTML format (viewable on a web-browser, as
well as many word processors)
* fwwp2tex - convert to plain-TeX format (for processing with TeX into
a typeset document).
* fwwp2latex - convert to LaTeX format (for processing with LaTeX into
a typeset document)
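Since fwwp2txt writes ISO 8859-1 text, the output may need re-encoding
before accented characters display correctly on a system that uses
UTF-8. A minimal sketch, assuming the standard iconv utility is
installed; iconv is not part of fwwputils, and the helper name
latin1_to_utf8 is made up for this example:

```shell
#!/bin/sh
# latin1_to_utf8: hypothetical helper, not part of fwwputils.
# Reads ISO 8859-1 text on stdin and writes UTF-8 to stdout.
latin1_to_utf8() {
    iconv -f ISO-8859-1 -t UTF-8
}

# Usage (MYFILE.txt as produced by fwwp2txt):
#   latin1_to_utf8 < MYFILE.txt > MYFILE.utf8.txt
```

If plain ASCII is acceptable, the -7 option of fwwp2txt avoids the
need for any re-encoding.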
Where to get it
---------------
http://semiprime.com/fontwriter/
Status
------
Basically, it reads/converts all the files that I've tried it on -
files with the following signatures:
SCFWS-174FW-700 SHARP FontWriter3 (Ver.1.00)
SCFWS-187FW-750 SHARP FontWriter7 (Ver.1.10)
Text and RTF conversion are currently the most advanced. HTML
conversion tends to produce redundant empty paragraphs. Plain TeX
conversion is variable: many things work very well, but some don't.
In particular, line-spacing of large fonts is poor. LaTeX conversion
is generally good, but lots of formatting is lost.
I reverse-engineered the file format, but there are still a few things
that I don't know about. I do not have easy access to a machine, so
it is not easy to experiment. My conclusions about the file format
are included in FontWriterWPFormat.txt.
Here's a quick summary of what is supported. In the library the core
features are supported - apart from the clip-art, the few omissions
are due either to my ignorance of their existence, or me not being
sure how to implement them. In the converters, omissions may well
depend on my knowledge of the target format, or I may just not have
got around to them yet.
Y=yes  N=no  A=approximated  P=partial
                      libfwwp  txt   RTF   HTML  LaTeX  TeX
Paper size            Y        N/A   Y     N/A   P      Y
Margins               Y        N/A   Y     N     Y      Y
Default formatting    Y        N/A   P     PA    P      P
Header/footer         Y        A     Y     N     Y      Y
  Formatting          Y        N/A   P     P     P
  Margins             Y        N/A   Y     Y     N
  Page numbers        N/A      N/A
  Page offset         Y        N/A
Special characters    P[1]     PA    P     P     P      PA
Character format
  Face                Y        N/A   Y     N     N      Y
  Size                Y        N/A   Y     Y     A      Y
  Condensed           Y        N/A   N     N     N      N
  Bold                Y        N/A   Y     Y     Y      P[2]
  Italic              Y        N/A   Y     Y     Y      P[2]
  Outline             Y        N/A   Y     N     N      N
  Underline           Y        N/A   Y     A[3]  A[3]   A[3]
  Sup/subscript       Y        N/A   Y     Y     N      N
Paragraph format
  Alignment           Y        N/A   Y     P     Y      P[4]
  Line spacing        Y        N/A   N     N     N      N
  Indents             Y        N/A   Y     N     N      Y
  Shading             Y        N/A   Y     N     N      N
  Tab positions/type  Y        N/A   Y     N     N      N
Frames                P[5]     N     N     N     N      N
Merge codes           N[6]
Clip art              N[6]
Notes:
[1] libfwwp converts special characters to Unicode/UCS-2 (16-bit).
This is currently incomplete, with many omissions. However, all
characters in the ISO 8859-1 subset (8-bit) are supported by
both the library and the conversion programs (although some
characters may be approximated by the latter). This subset
includes hard-space, soft-hyphen, tab, form-feed (page-break)
and many common accented Latin characters.
[2] When converting text that is simultaneously italic+bold into
TeX, the result is generally approximated, unless the font is
Dutch, which is converted into CM-Roman, which has a bold-italic
variant.
[3] Generally, most fancy underlines get converted into a plain
underline. RTF supports different types, and double underline
is approximated in TeX.
[4] TeX conversion only keeps justified and left-aligned text.
[5] Frame markers are detected by the library, and there is a
library function to read the frame data. However, this has not
been tested, and there is no function to read the line styles.
[6] Merge codes and clip art are the main omissions from the library.
I know how merge codes work in the .WP file, but not in the
Address Book .AB file, and am not sure how best to implement
this. I have no idea how clip art is implemented and have no
easy way of finding out.
Omissions/issues/known bugs:
* Testing (make check) in Windows Debug pops up bogus debug windows.
* The interface to FWWPFile is unsatisfactory. In particular:
- There's no way of reading more than one character at a time.
This is related to the first point, since if you read multiple
characters you'd still need to process them individually, to
check for formatting characters and special characters.
- You can't really write C programs using the library.
* Merge codes are converted to funny characters.
* TeX conversion bugs:
- No line-breaks allowed in \underbar{...}.
- pound character always same size, weight, italic.
- Character/line spacing is poor.
- Paragraph indentation is sometimes lost if group is closed
before paragraph break.
* HTML <p> element is kludged horribly: produces lots of
unnecessary empty paragraphs.
* Multiple spaces are eaten when converting to HTML, TeX or LaTeX.
I consider this to be the correct behaviour, since when these
files are processed, multiple spaces are ignored, so by removing them,
we produce neater converted files.
(The -e option controls space-eating in fwwp2txt and fwwp2rtf).
* Multiple space skipping isn't that smart anyway - it sometimes
doesn't skip them.
* In fwwp2tex and fwwp2latex, the smart-quote conversion isn't that
smart either.
* Command-line options I might add to converter programs:
-f: Force overwrite output file (currently refuses)
-i: ... or would interactive be better?
-p: Output preliminaries
-q: Output postliminaries
-v: Verbose output - currently failure is a bit hard to debug.
* It will not *write* .WP files. If you want this functionality,
try saving the file as WordPerfect-6.0-for-DOS or ASCII-text file
and using the Font Writer's own conversion utility (available on
the FW-760, but I'm not sure which model it was introduced on).
It is unlikely that this will ever be supported, since I would need
access to a Font Writer machine to test it.
Compilation (GNU/Linux)
-----------------------
(Tested on Slackware 11.0 GNU/Linux with gcc 3.4.6)
$ ./configure [--prefix=PREFIX]
$ make
If you want a list of options for configure do:
$ ./configure --help
Testing (GNU/Linux)
-------------------
$ make check
(Version >0.5.10: chars.wp gives lots of unknown-character warnings).
This will run all the conversion programs on a set of simple example
files. As long as you haven't made any changes, it should not fail.
If you have made any changes this is a good way of checking them.
Installation (GNU/Linux)
------------------------
First, uninstall any previous version (this probably isn't essential
unless you are upgrading from a version earlier than 0.2). Then, as
root:
# make install
or
# make install-strip
These put files in /usr/local/lib/ and /usr/local/bin/. If you can't
or don't want to install to /usr/local/, use the --prefix=PREFIX
option when you run ./configure.
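As a rough sketch of a non-root install, the build, test and install
steps can be chained so that a failure stops the sequence. The function
name install_fwwputils and the default prefix $HOME/fwwputils are
assumptions made for this example, not something the package defines:

```shell
#!/bin/sh
# install_fwwputils [PREFIX]
# Hypothetical wrapper: build, test and install under a private prefix.
# Run it from the top of the unpacked source tree.
install_fwwputils() {
    prefix=${1:-"$HOME/fwwputils"}    # assumed default location
    ./configure --prefix="$prefix" &&
    make &&
    make check &&
    make install
}

# Afterwards, make the programs and the shared library findable, e.g.:
#   PATH=$HOME/fwwputils/bin:$PATH
#   LD_LIBRARY_PATH=$HOME/fwwputils/lib:$LD_LIBRARY_PATH
#   export PATH LD_LIBRARY_PATH
```

The LD_LIBRARY_PATH setting is needed only because the shared library
then lives outside the directories that the loader searches by default.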
To uninstall:
# make uninstall
This just deletes files - old versions are not restored.
Manual installation (GNU/Linux)
-------------------------------
All the utilities use a shared library called libfwwp.so.x.y.z, where the
x.y.z is the version number.
1. Copy the shared library libfwwp.so.x.y.z to somewhere where it will
be picked up by the shared library loader (I'd suggest /usr/local/lib/
if you have root access; somewhere in $LD_LIBRARY_PATH if you don't).
It needs read permissions - execute permissions are unnecessary.
2. Add soft-links to libfwwp.so.x.y.z called libfwwp.so, libfwwp.so.x and
libfwwp.so.x.y
3. If you're installing globally (e.g. to /usr/local/lib/) run /sbin/ldconfig
to update /etc/ld.so.cache
4. Copy the fwwp2* executables to wherever you want them (/usr/local/bin/).
5. For documentation of the conversion programs, copy the man file
man/fwwputils.1 into a man1 directory (/usr/local/man/man1/). To
get the same documentation under other names, copy man/fwwp2foo.1
into the same directory, once for each alias, giving each copy the
name of that alias. For example, copy it as fwwp2rtf.1 and
fwwp2html.1 if you want "man fwwp2rtf" and "man fwwp2html" to
work.
6. If you wish to compile any other programs that use libfwwp, copy
the header file fwwp.h into an include directory (/usr/local/include/).
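The soft-links in step 2 are the fiddly part of the steps above. A
minimal sketch that derives the three link names from the library
filename; link_libfwwp is a hypothetical helper written for this
example, not something shipped with fwwputils:

```shell
#!/bin/sh
# link_libfwwp LIBDIR LIBFILE
# Given LIBFILE of the form libfwwp.so.x.y.z already copied into LIBDIR,
# create the soft-links libfwwp.so, libfwwp.so.x and libfwwp.so.x.y.
link_libfwwp() {
    libdir=$1
    libfile=$2                        # libfwwp.so.x.y.z
    version=${libfile#libfwwp.so.}    # x.y.z
    major=${version%%.*}              # x
    minor=${version#*.}               # y.z
    minor=${minor%%.*}                # y
    ( cd "$libdir" &&
      ln -sf "$libfile" libfwwp.so &&
      ln -sf "$libfile" "libfwwp.so.$major" &&
      ln -sf "$libfile" "libfwwp.so.$major.$minor" )
}

# Usage, as root, with x.y.z replaced by the real version number:
#   cp libfwwp.so.x.y.z /usr/local/lib/
#   link_libfwwp /usr/local/lib libfwwp.so.x.y.z
#   /sbin/ldconfig
```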
Windows versions
----------------
I generally use Linux, so the Linux versions will typically be the
most well-tested. Windows binaries should appear a few days after the
source-code versions - unless I need to make any corrections to get
the Windows version to work. I currently compile with MSVC, so I
would be interested to know of any changes needed for other compilers.
Compilation (MS Windows)
------------------------
If you need it, I believe that you can freely download the latest
version of Microsoft's C++ compiler (command-line tools only) from:
http://msdn.microsoft.com/visualc/vctoolkit2003/
Alternatively, you can try MinGW "Minimalist GNU for Windows":
http://www.mingw.org/
Method A:
Cheat - download binaries from http://semiprime.com/fontwriter/
Method B:
Use the provided fwwputils.mak makefile for Microsoft NMAKE (part of
MS Developer Studio). Simply cd to the source directory and type:
nmake /f fwwputils.mak
The *.exe files will be created in the Win32Release directory.
Optionally, you can run some automatic tests with:
nmake /f fwwputils.mak check
Method C: (I haven't tried this)
Use MS Developer Studio to create a project based on fwwputils.mak.
Method D: (I haven't tried this for a while)
For a general compiler, to compile fwwp2rtf.exe (statically) create a
project called fwwp2rtf, containing source files:
fwwp.cpp
fwwpchars.cpp
fwwpconv.cpp
fwwp2rtf.cpp (change this for different converters!)
fwwpoutput.cpp
and headers:
fwwp.h
fwwpconv.h
You may like to define the macro PACKAGE_VERSION to be a string
corresponding to the version number. However this is optional and
it should still work without this macro.
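For a command-line compiler, Method D might look something like the
following. This is an untested sketch: it assumes a g++-compatible
compiler (for example under MinGW) run from the source directory, and
build_static is a name invented for this example:

```shell
#!/bin/sh
# build_static NAME [VERSION]
# Hypothetical helper: compile one converter (e.g. fwwp2rtf) together with
# the library sources, so that no shared library is needed.
# CXX defaults to g++, which is an assumption about the toolchain.
build_static() {
    name=$1
    version=${2:-unknown}
    ${CXX:-g++} -DPACKAGE_VERSION="\"$version\"" \
        fwwp.cpp fwwpchars.cpp fwwpconv.cpp fwwpoutput.cpp "$name.cpp" \
        -o "$name"
}

# Usage:
#   build_static fwwp2rtf 0.5.14
#   build_static fwwp2html 0.5.14
```

The headers fwwp.h and fwwpconv.h only need to be in the same directory
as the source files.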
Installation (MS Windows)
-------------------------
To install, copy the resulting executable, fwwp2rtf.exe, to a suitable
folder (I'd suggest the "Program Files" folder). Then find a .WP file
in Windows Explorer, right-click, choose open-with and select the
fwwp2rtf executable, ticking the "always use this application" box.
To convert files after this first one, just double click in Explorer -
you'll probably need to refresh to see the *.rtf file that will appear
in the same folder. If a *.rtf file already exists in the same
folder, you'll need to rename or delete it, since fwwp2* will not
overwrite files (I may add a command-line option at some point).
To convert multiple files, highlight them all (try select-all!),
right-click and select open (or open-all?). Again you may need to
refresh to see the output files.
If it doesn't work, execute from a DOS box - this should give useful
information about the error.
Command-line syntax for converters
----------------------------------
Note: this section refers to executables compiled with access to a
unistd.h header-file (e.g. on GNU/Linux). Other executables
(e.g. most MS Windows executables) have a much less flexible syntax.
Usage:
fwwp2* -V
fwwp2* [options] INFILE [-o OUTFILE]
If the output file is omitted, an output filename is generated from
the first argument. If the output filename already exists, it will
not be overwritten - an error is returned.
Options:
-n: Naked conversion - no preliminary or postliminary markup.
Additional options for some converters:
-e: eat white-space [fwwp2txt, fwwp2rtf]
-7: 7-bit ASCII [fwwp2txt]
-h: HTML 4.01 conversion [fwwp2html]
Examples:
$ fwwp2rtf MYFILE.WP
$ # Converts MYFILE.WP to MYFILE.rtf
$ fwwp2html MYFILE.WP -o converted.html
$ # Converts MYFILE.WP to converted.html (XHTML 1.0 Transitional)
$ fwwp2html -n -h MYFILE.WP -o fragment.html
$ # Output fragment.html has no preliminaries, so can be
$ # inserted in its entirety into a complete html-file.
$ # (HTML 4.01 Transitional)
$ fwwp2tex -V
$ # Prints version number and exits
$ fwwp2txt -e MYFILE.WP
$ # Converts MYFILE.WP to MYFILE.txt, but converting repeated
$ # spaces to single spaces
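To convert a whole directory, the single-file syntax above can be
wrapped in a loop. A minimal sketch: convert_all is a hypothetical
helper, it only matches the upper-case .WP extension, and it skips any
file whose output already exists because the converters refuse to
overwrite:

```shell
#!/bin/sh
# convert_all CONVERTER EXT DIR
# Hypothetical helper: run CONVERTER on every *.WP file in DIR, writing
# the output next to the input with extension EXT.
convert_all() {
    converter=$1
    ext=$2
    dir=$3
    for infile in "$dir"/*.WP; do
        [ -e "$infile" ] || continue      # no matches: glob stays literal
        outfile="${infile%.WP}.$ext"
        if [ -e "$outfile" ]; then
            echo "skipping $infile: $outfile exists" >&2
            continue
        fi
        "$converter" "$infile" -o "$outfile" || echo "failed: $infile" >&2
    done
}

# Usage:
#   convert_all fwwp2rtf rtf .
#   convert_all fwwp2html html "$HOME/fontwriter"
```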
Hacking/Debugging
-----------------
The result of my reverse-engineering of the Font Writer format is
recorded in FontWriterWPFormat.txt. Fixes or additions welcome.
For extra debugging information, enable #define DEBUG_FWWP 1 in fwwp.h
and recompile.
If you make any changes, run
$ make check
or, in Windows:
> nmake /f fwwputils.mak check
to run some tests (version >0.5.10: chars.wp gives lots of
unknown-character warnings). This basically runs all the fwwp2*
programs on the simple example files in the testfiles/ directory, and
checks the output files against the known-good versions in the same
directory. Of course, if you've improved the conversions, then the
known-good versions may be outdated - diff the files and check the
changes manually, updating the known-good output files if appropriate.
Conversely, if you make any changes, consider adding a test case - to
make sure no-one else breaks your addition. All that you need is to
add a .WP file and known-good conversions to the testfiles/ directory.
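When examining a single failing case by hand, a comparison along the
following lines may help. It is only a sketch: check_conversion is a
hypothetical helper, not what "make check" actually runs, and the
caller supplies the known-good filename:

```shell
#!/bin/sh
# check_conversion CONVERTER INFILE EXPECTED
# Hypothetical helper: convert INFILE into a scratch directory and diff
# the result against the known-good file EXPECTED.
# Returns 0 if they match, 1 if they differ or the conversion fails.
check_conversion() {
    converter=$1
    infile=$2
    expected=$3
    tmpdir=$(mktemp -d) || return 2
    actual="$tmpdir/actual.out"   # must not exist: converters won't overwrite
    if "$converter" "$infile" -o "$actual" && diff -u "$expected" "$actual"
    then
        status=0
    else
        status=1
    fi
    rm -rf "$tmpdir"
    return $status
}

# Usage (the known-good filename is up to you):
#   check_conversion fwwp2rtf testfiles/MYFILE.WP my-known-good.rtf
```

Any differences are printed in unified diff format, which makes it
easier to decide whether the known-good file should be updated.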
Distribution
------------
On a GNU system,
$ make dist
will create a fwwp-x.y.z.tar.bz2 file of the source-code, suitable for
distribution. Alternatively, use "make disttzg" or "make distzip".
If you have any requests
------------------------
If you wish me to fix or add to this code, then I am open to suggestions.
If you want something fixed or added, please send me a .WP file which
demonstrates your problem, and a description of what it should look
like. Minimal examples are preferred, but in many cases a real-world
example will do just as well - although I do need to know what it is
supposed to look like.
If you have a file that is not converting correctly, please send me
both input and output examples, so that I can check that I am really
reproducing the problem.
History
-------
Highlights listed here - see ChangeLog.txt for more details.
YYYY-MM-DD
2007-03-31: Version 0.5.14
Moved website to a new URL, documentation changes.
2004-05-12: Version 0.5.13
Fixes for non-ASCII header/footers, gcc 3.3.x and non-UK files.
2004-04-09: Version 0.5.12
Additions for compilation under MS-Windows. Header/footer alignment.
2004-03-28: Version 0.5.11
Fixed TeX conversion.
2004-03-21: Version 0.5.10
Many more characters converted by the library, some corrections.
2004-02-17: Version 0.5.09
Bugfixes in library, including default tab size inch-correction.
2004-02-06: Version 0.5.08
Header and footer formatting. Page margins converted by fwwp2latex.
2003-12-20: Version 0.5.7
More characters converted in TeX/LaTeX.
2003-11-26: Version 0.5.6
Library fixes and additions for files containing frames.
2003-11-18: Version 0.5.5
Improvements to lib/converters - characters, paper sizes. Bug fixes.
2003-10-22: Version 0.5.4
Fixed local install/uninstall.
2003-10-14: Version 0.5.3
Some API rationalisation. Simple test-code to test FWWPFile behaviour.
2003-10-05: Version 0.5.2
FWWPFile error-handling improvements. New FWWPFile count functions.
2003-09-30: Version 0.5.1
Fixes to new API mechanisms introduced in 0.5. Build improvements.
2003-09-23: Version 0.5
API changes: nextchar(), nextcharctrl(), format[para|char|tab]()
2003-09-15: Version 0.4.2
More improvements to character conversion. A5 size corrected.
2003-09-14: Version 0.4.1
Better character support outside of ISO 8859-1.
2003-09-13: Version 0.4
More detailed knowledge of file format, FWWPFile API additions.
2003-09-08: Version 0.3.3
-h option for HTML 4.01.
2003-09-05: Version 0.3.2
Moved to GNU autoconf, general clean-up of build-system.
2003-09-02: Version 0.3.1
-n option, fwwp2txt head/footer and -7 option, man files.
2003-08-30: Version 0.3
Improvements to command-line syntax & TeX conversion. Option -e.
2003-08-13: Version 0.2
First public release