The main feature of a2ps is its pretty-printing capabilities. Two different levels of pretty printing can be reached:
Note that the difference is up to the author of the style sheet.
a2ps is not a powerful syntactic pretty-printer: it just handles lexical structures, i.e., if in your favorite language
IF IF == THEN THEN THEN := ELSE ELSE ELSE := IF
is legal, then a2ps is not the tool you need. Indeed a2ps just looks for some keywords, or some sequences.
To provide a high degree of expressivity, Claire uses:
To achieve its goal of readability, Claire uses
More information on Claire can be found on the Claire home page.
The language itself is not just a programming language but also covers analysis, design and implementation.
Heavy highlight uses symbols to represent common math operators.
The style sheets for77kwds and for90kwds implement keywords only,
while the style sheets for-fixed and for-free implement comments
only.
This style sheet tries to support any of the various flavors (Fortran 77/90/95, fixed or free form). For more specific uses, you should use either:
See the documentation of the style sheet fortran for more details.
int main (void)
it won't work. Write:
int
main (void)
Whenever the changes of encoding are clear, a2ps itself sets the encoding for the parts concerned.
Tag 1 is the subject, and Tag 2 the author of the mail/news.
Note: This style sheet is _very_ difficult to write. Please don't report behavior you don't like. Just send me improvements, or write a Bison parser for mails.
This sheet was designed based on Modula 3 home page.
Implementation of the sheet based on The Oberon Reference Site.
It can be a good choice of destination language for people who want to produce text to print (e.g. pretty-printing, automated documentation etc.) but who definitely do not want to learn PostScript, nor to require the use of LaTeX.
Through LaTeX-like commands, it provides a way to describe the pages that this program should produce.
The Python interpreter and the extensive standard library are freely available in source or binary form for all major platforms from the Python web site, and can be freely distributed.
The same site also contains distributions of and pointers to many free third party Python modules, programs and tools, and additional documentation.
The Python interpreter is easily extended with new functions and data types implemented in C or C++ (or other languages callable from C). Python is also suitable as an extension language for customizable applications.
program --help | a2ps -Ecard
Implementation of the sheet based on the Sather home page.
Heavy highlighting uses symbols for common mathematical operators.
Typical use of this style is:
diff -u old new | a2ps -Eudiff
The prologue diff helps to highlight the differences
(`a2ps -Ewdiff --prologue=diff').
wdiff is a utility that underlines the differences
of words between two files. Where diff only reports the
lines that have changed, wdiff reports the words that have changed inside the lines.
Typical use of this style is:
wdiff old new | a2ps -Ewdiff
wdiff can be found in usual GNU repositories. The prologue diff
helps to highlight the differences (`a2ps -Ewdiff --prologue=diff').
This section presents a few style sheets that define page description languages (as opposed to most other style sheets, which are meant to pretty-print source files).
The style sheet Symbol introduces easy to type keywords to obtain
the special characters of the PostScript font Symbol. The
keywords are named to provide a LaTeX taste. These keywords are also
the names used when designing a style sheet, hence to get the full list,
see section 7.6.1 A Bit of Syntax.
If you want to know the correspondence, it is suggested to print the
style sheet file of Symbol:
a2ps -g symbol.ssh
PreScript has been designed in conjunction with a2ps. Since
bold sequences, special characters etc. were implemented in a2ps, we
thought it would be good to allow direct access to those features:
PreScript became an input language for a2ps, where special
font treatments are specified in an ssh syntax (see section 7.6 Style Sheets Implementation).
The main advantages for using PreScript are:
It can be a good candidate for generation of PostScript output (syntactic pretty-printers, generation of various reports etc.).
Every command name begins with a backslash (`\'). If the command uses an argument, it is given between curly braces with no spaces between the command name and the argument.
The main limit on PreScript is that no command can be used inside
another command. For instance the following line will be badly
interpreted by a2ps:
\Keyword{Problems using \keyword{recursive \copyright} calls}
The correct way to write this in PreScript is
\Keyword{Problems using} \keyword{recursive} \copyright \Keyword{calls}.
Everything from an unquoted % to the end of line is ignored (comments).
These commands require arguments.
PreScript and a2ps can be used for on-the-fly
formatting. For instance, on the `passwd' file:
ypcat passwd |
  awk -F: \
    '{print "\\Keyword{" $5 "} (" $1 ") \\rightarrow\\keyword{" $7 "}"}' \
  | a2ps -Epre -P
The aim of the PreTeX style sheet is to provide something similar to
PreScript, but with a more LaTeX like syntax.
`$' is ignored in PreTeX for compatibility with LaTeX,
and `%' introduces a comment. Hence they are the only symbols which
have to be quoted by a `\'. The following characters should also be
quoted to produce good LaTeX files, but are accepted by
PreScript: `_', `&', `#'.
Note that inside a command, like \textbf, the quotation
mechanism does not work in PreScript (\textrm{#$%}
writes `#$%') though LaTeX still requires quotation. Hence whenever
special characters or symbols are introduced, they should be at the
outermost level.
These commands require arguments.
Symbol).
The following symbols, inherited from the style sheet Symbol, are
not supported by LaTeX:
`\Alpha', `\apple', `\Beta', `\carriagereturn', `\Chi', `\Epsilon', `\Eta', `\florin', `\Iota', `\Kappa', `\Mu', `\Nu', `\Omicron', `\omicron', `\radicalex', `\register', `\Rho', `\suchthat', `\Tau', `\therefore', `\trademark', `\varUpsilon', `\Zeta'.
LaTeX is more demanding about special symbols. Most of them must be in so-called math mode, which means that the command must be inside `$' signs. For instance, though
If \forall x \in E, x \in F then E \subseteq F.
is perfectly legal in PreTeX, it should be written
If $\forall x \in E, x \in F$ then $E \subseteq F$.
for LaTeX. Since in PreTeX every `$' is discarded (unless quoted by a `\'), the second form is also admitted.
TeXScript is a replacement of the old version of
PreScript: it combines both the a2ps-like and the
LaTeX-like syntaxes through inheritance of both PreScript and
PreTeX.
In addition, it provides commands meant to ease the processing of a2ps files by LaTeX.
Everything between `%%TeXScript:skip' and `%%TeXScript:piks'
will be ignored in TeXScript, so that command definitions meant
exclusively for LaTeX can be inserted there.
The commands `\textbi' (for bold-italic) and `\textsy' (for symbol) do not exist in LaTeX. They should be defined in the preamble:
%%TeXScript:skip
\newcommand{\textbi}[1]{\textbf{\textit{#1}}}
\newcommand{\textsy}[1]{#1}
%%TeXScript:piks
There is no way in TeXScript to get an automatic numbering. There is
no equivalent to the LaTeX environment enumerate. But every
command beginning with \text is doubled by a command beginning with
`\magic'. a2ps behaves the same way on both families of commands.
Hence, if one specifies that arguments of those functions should be
ignored in the preamble of the LaTeX document, the numbering is
emulated. For instance
\begin{enumerate}
\magicbf{1.}\item First line
\magicbf{2.}\item Second line
\end{enumerate}
will be treated the same way both in TeXScript and LaTeX.
`\header' and `\footer' are not understood by LaTeX.
A face is an attribute given to a piece of text, which specifies how it should look. Since a2ps is devoted to pretty-printing source files, the faces it uses are related to the syntactic entities that can be encountered in a file.
The faces a2ps uses are:
Actually, there is also the face `Symbol', but this one is particular: its font cannot be changed.
a2ps pretty prints a source file thanks to style sheets, one per language. The following describes how style sheets are defined. You may skip this section if you don't care how a2ps does this, and if you don't expect to implement new styles.
Every style sheet has both a key, and a name. The name can be clean and beautiful, with any character you might want. The key is in fact the prefix part of the file name, and is alpha-numerical, lower case, and less than 8 characters long.
Anywhere a2ps needs to recognize a style sheet by a name, it uses the key (in the `sheets.map' file, with the option `-E', etc.).
As an example, C++ is implemented in a file called `cpp.ssh', in which the name is declared to be `C++'.
The rationale is that not every system accepts any character in the file name (e.g., no `+' in MS-DOS). Moreover, it allows symbolic links to be made on the ssh files (e.g., `ln -s cpp.ssh c++.ssh' lets you use `-E c++').
An ssh file can include the name of its author, a version number, a documentation note and a requirement on the version of a2ps. For instance, if a style sheet requires a2ps version 4.9.6, then a2ps version 4.9.5 will reject it.
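The version requirement can be pictured with a small sketch. The Python below is only an illustration: the helper names `parse_version` and `satisfies` are hypothetical, and it assumes purely numeric dotted versions such as 4.9.5. It is not a description of how a2ps itself performs the test.

```python
def parse_version(text):
    """Turn a dotted version such as "4.9.5" into a tuple of integers."""
    return tuple(int(part) for part in text.split("."))


def satisfies(running, required):
    """True when the running a2ps is at least the version the sheet requires."""
    return parse_version(running) >= parse_version(required)


# A sheet that requires 4.9.6 is rejected by a2ps 4.9.5, accepted by 4.9.6.
print(satisfies("4.9.5", "4.9.6"))
print(satisfies("4.9.6", "4.9.6"))
```

Comparing tuples of integers rather than raw strings keeps a version such as 4.10 above 4.9.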
a2ps needs to know the beginning and the end of a word, especially keywords. Hence it needs two alphabets: the first one specifying by which letters an identifier can begin, and the second one for the rest of the word. If you prefer, a keyword starts with a character belonging to the first alphabet, and a character not pertaining to the second is a separator.
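As a rough illustration of the two alphabets, the Python sketch below splits a line into words. The function name `find_words` is hypothetical, the alphabets are the defaults listed later in this manual, and the scanning loop is an assumption about the general idea rather than the actual a2ps code.

```python
import string

# Default alphabets as listed in this manual: letters and underscore to
# start a word, the same characters plus digits for the rest of the word.
FIRST = string.ascii_letters + "_"
SECOND = FIRST + string.digits


def find_words(line, first=FIRST, second=SECOND):
    """Return the words of line.

    A word starts on a character of the first alphabet and runs until a
    character outside the second alphabet (a separator) is met.
    """
    words = []
    i = 0
    while i < len(line):
        if line[i] in first:
            j = i + 1
            while j < len(line) and line[j] in second:
                j += 1
            words.append(line[i:j])
            i = j
        else:
            i += 1
    return words


print(find_words("int x1 = foo_bar + 2;"))
```

Here `x1` is a single word because digits belong to the second alphabet, while the lone `2` is not a word because digits are not in the first one.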
If the style is case insensitive, then matching is case insensitive (keywords, operators and sequences).
A P-rule (Pretty printing rule), or rule for short, is a structure which consists of two items:
Just a short example: `(foo, bar, Keyword_strong)' as a rule
means that every input occurrence of `foo' will be replaced by
`bar', written with the Keyword_strong face.
If the destination string is empty, then a2ps will use the source string. This is different from giving the source string as a destination string if the case is different. An example will make it fairly clear.
Let foobar be a case insensitive style sheet including the
rules `(foo, "", Keyword)' and `(bar, bar, Keyword)'. Then,
on the input `FOO BAR', a2ps will produce `FOO bar' in
Keyword.
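The foobar example can be replayed with a few lines of Python. This is a sketch under stated assumptions: `apply_rules` and the dictionary layout are hypothetical names chosen for the illustration, and only the behavior described above (an empty destination keeps the text as found in the input) is modeled.

```python
def apply_rules(words, rules, case_insensitive=True):
    """Apply P-rules to a list of words.

    rules maps a source string to a (destination, face) pair. An empty
    destination means "write the text exactly as it was met in the input".
    """
    table = {(src.lower() if case_insensitive else src): rule
             for src, rule in rules.items()}
    result = []
    for word in words:
        key = word.lower() if case_insensitive else word
        if key in table:
            destination, face = table[key]
            result.append((destination if destination else word, face))
        else:
            result.append((word, "Plain"))
    return result


foobar = {"foo": ("", "Keyword"), "bar": ("bar", "Keyword")}
print(apply_rules(["FOO", "BAR"], foobar))
```

The first rule leaves `FOO` in upper case, whereas the second one forces the lower case spelling `bar`.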
a2ps implements two different ways to match a string. The difference
comes from the fact that some keywords are sensitive to the delimiters around
them (such as `unsigned' and `int' in C, which are
definitely not the same thing as `unsignedint'), and others are not (in
C, `!=' is "different from" both in `a != b' and
`a!=b').
The first ones are called keywords in a2ps jargon, and the second ones are called operators. Operators are matched anywhere they appear, while keywords need to have separators around them (see section 7.5.3 Alphabets).
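The distinction between keywords and operators can be sketched with Python regular expressions. The names `find_keyword` and `find_operator` are hypothetical, and the set of identifier characters is an assumption standing in for the alphabets of a real style sheet.

```python
import re

# Assumption: identifier characters are letters, digits and underscore.
WORD_CHARS = "A-Za-z0-9_"


def find_keyword(word, line):
    """Offsets where word appears with no identifier character touching it."""
    pattern = (r"(?<![%s])" % WORD_CHARS) + re.escape(word) + (r"(?![%s])" % WORD_CHARS)
    return [m.start() for m in re.finditer(pattern, line)]


def find_operator(op, line):
    """Offsets where op appears, whatever surrounds it."""
    return [m.start() for m in re.finditer(re.escape(op), line)]


print(find_keyword("int", "unsigned int a; unsignedint b;"))
print(find_operator("!=", "a != b || a!=b"))
```

The keyword `int` is found once only, since the second occurrence is glued to `unsigned`, while the operator `!=` is found in both spellings.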
Let us give a more complicated example: that of the Yacc rules.
A rule in Yacc is of the form:
a_rule : part1 part2 ;
Suppose you want to highlight these rules. To recognize them, you will write a regular expression specifying that:
The regexp you want is: `/^[a-zA-Z0-9_]*[\t ]*:/'. But with the rule
/^[a-zA-Z0-9_]*[\t ]*:/, "", Label_strong
the blanks and the colon are highlighted too. Hence you need to specify some parts in the regexp (see section `Back-reference Operator' in Regex manual), and use a longer list of destination strings. The correct rule is
(/^\\([a-zA-Z0-9_]*\\)\\([\t ]*:\\)/, \1 Label_strong, \2 Plain)
Since it is a bit painful to read, regexps can be spread upon several lines. It is strongly suggested to break them by groups, and to document the group:
(/^\\([a-zA-Z0-9_]*\\)/    # \1. Name of the rule
 /\\([\t ]*:\\)/           # \2. Trailing space and colon
 \1 Label_strong, \2 Plain)
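For readers more familiar with Python's `re` module, here is a hedged transcription of the Yacc rule above. It is only a sketch: Python writes groups as `(...)` where the style sheet writes `\\(...\\)`, and the helper `highlight` is a hypothetical name.

```python
import re

YACC_RULE = re.compile(
    r"^([a-zA-Z0-9_]*)"   # \1. Name of the rule
    r"([\t ]*:)"          # \2. Trailing space and colon
)


def highlight(line):
    """Split line into (text, face) pairs following the rule above."""
    m = YACC_RULE.match(line)
    if m is None:
        return [(line, "Plain")]
    return [(m.group(1), "Label_strong"),
            (m.group(2), "Plain"),
            (line[m.end():], "Plain")]


print(highlight("a_rule : part1 part2 ;"))
```

Only the name of the rule receives the Label_strong face; the blanks and the colon fall into the second group and stay Plain.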
A sequence is a string between two markers, along with a list of exceptions. A marker is a fixed string. Typical examples are comments, strings (usually with `"' as opening and closing markers, and `\\' and `\"' as exceptions), etc. Three faces are used: one for the initial marker, one for the core of the sequence, and a last one for the final marker.
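The role of exceptions can be illustrated by a small scanner. This Python sketch is an assumption about the general mechanism, not the a2ps implementation; `scan_sequence` and its parameters are hypothetical names.

```python
def scan_sequence(text, start, opener, closer, exceptions=()):
    """Scan the sequence opened at text[start].

    Returns (core, end) where core is the text between the markers and end
    is the index just after the closing marker, or None when text[start:]
    does not begin with the opening marker. Exceptions are skipped whole,
    so a closing marker hidden inside one of them does not end the sequence.
    """
    if not text.startswith(opener, start):
        return None
    i = core_start = start + len(opener)
    while i < len(text):
        for exception in exceptions:
            if text.startswith(exception, i):
                i += len(exception)
                break
        else:
            if text.startswith(closer, i):
                return text[core_start:i], i + len(closer)
            i += 1
    # Unterminated sequence: it runs to the end of the input.
    return text[core_start:], len(text)


line = r'x = "say \"hi\"" + y'
# Exceptions of a C string: an escaped backslash and an escaped quote.
print(scan_sequence(line, line.index('"'), '"', '"', ['\\\\', '\\"']))
```

Without the two exceptions the scan would stop at the first escaped quote, and the rest of the string would be treated as ordinary code.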
There are two levels of pretty-printing encoded in the style sheets. By default, a2ps uses the first level, called normal, unless the option `-g' is specified, in which case, heavy highlighting is invoked, i.e., optional keywords, operators and sequences are considered.
The previous section (see section 7.5 Style Sheets Semantics) explained the various items needed to understand the machinery involved in pretty printing. This section explains their implementation, i.e., how to write a style sheet file. The next section (see section 7.7 A Tutorial on Style Sheets) walks through a simple example step by step.
Here are the lexical rules underlying the style sheet language:
alphabet,alphabets,are,case,documentation,end,exceptions,first,in,insensitive,is,keywords,operators,optional,second,sensitive,sequences,style
Comment,Comment_strong,Encoding,Error,Index1,Index2,Index3,Index4,Invisible,Keyword,Keyword_strong,Label,Label_strong,Plain,String,Symbol,Tag1,Tag2,Tag3,Tag4
C-char,C-string
It is a good idea to print the style sheet `symbols.ssh' to see them:
---,\Alpha,\Beta,\Chi,\Delta,\Downarrow,\Epsilon,\Eta,\Gamma,\Im,\Iota,\Kappa,\Lambda,\Leftarrow,\Leftrightarrow,\Mu,\Nu,\Omega,\Omicron,\Phi,\Pi,\Psi,\Re,\Rho,\Rightarrow,\Sigma,\Tau,\Theta,\Uparrow,\Upsilon,\Xi,\Zeta,\aleph,\alpha,\angle,\approx,\beta,\bullet,\cap,\carriagereturn,\cdot,\chi,\circ,\clubsuit,\cong,\copyright,\cup,\delta,\diamondsuit,\div,\downarrow,\emptyset,\epsilon,\equiv,\eta,\exists,\florin,\forall,\gamma,\geq,\heartsuit,\in,\infty,\int,\iota,\kappa,\lambda,\langle,\lceil,\ldots,\leftarrow,\leftrightarrow,\leq,\lfloor,\mu,\nabla,\neq,\not,\not\in,\not\subset,\nu,\omega,\omicron,\oplus,\otimes,\partial,\perp,\phi,\pi,\pm,\prime,\prod,\propto,\psi,\radicalex,\rangle,\rceil,\register,\rfloor,\rho,\rightarrow,\sigma,\sim,\spadesuit,\subset,\subseteq,\suchthat,\sum,\supset,\supseteq,\surd,\tau,\theta,\therefore,\times,\trademark,\uparrow,\upsilon,\varUpsilon,\varcopyright,\vardiamondsuit,\varphi,\varpi,\varregister,\varsigma,\vartheta,\vartrademark,\vee,\wedge,\wp,\xi,\zeta
a2ps symbols.ssh
C escaping mechanism is used.
C escaping mechanism is used. Regexps can be
split in several parts, à la C strings (i.e., `/part 1/ /part 2/').
The definition of the name of the style sheet is:
style name is
# body of the style sheet
end style
The following constructions are optional:
version
version is version-number
written
written by authors
Giving your email is useful for bug reports about style sheets.
written by "Some Body <Some.Body@some.whe.re>"
requires
requires a2ps a2ps-version-number
documentation
documentation is strings end documentation
strings may be a list of strings, without commas, in which case new lines are automatically inserted between each item. See section 5.1 Documentation Format, for details on the format. Please write useful comments: not `This style is devoted to C files', since the name is there for that, nor `Report errors to mail@me.somewhere', since written by is there for that.
documentation is
"Not all the keywords are used, to avoid too much"
"bolding. Heavy highlighting (code(-g)code), covers"
"the whole language."
end documentation
There are two things a2ps needs to know: what a symbol consists of, and whether the style is case insensitive.
alphabet
first alphabet is string
second alphabet is string
If both are identical, you may use the shortcut
alphabets are string
The default alphabets are
first alphabet is
   "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_"
second alphabet is
   "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ_\
0123456789"
Note that it is on purpose that no character intervals are used.
case
case insensitive # e.g., Fortran, Pascal etc.
case sensitive # e.g., C, C++, Perl, Sather, Java etc.
The default is case insensitive.
It is possible to extend an existing style. The syntax is:
ancestors are ancestor_1[, ancestor_2...] end ancestors
where ancestor_1 etc. are style sheet keys.
For semantics, the rules are the following:
As an example, both C++ and Objective C style sheets
extend the C style sheet:
style "Objective C" is
#[...]
ancestors are
  c
end ancestors
#[...]
end style
To the biggest surprise of the author, mutually dependent style sheets do work!
See section 7.5.5 P-Rules, for the definition of P-rule.
Because of various short cuts, there are many ways to declare a rule:
rules ::= rule_1 `,' rule_2...
rule ::= `(' lhs rhs `)'
| lhs srhs ;
lhs ::= string | regex ;
rhs ::= srhs `,' ...
srhs ::= latex-keyword | expansion face
expansion ::= string | `\'num | <nothing>;
face ::= face-keyword | <nothing>;
The rules are the following:
#define RE_SYNTAX_A2PS \
  (RE_SYNTAX_EMACS | RE_CHAR_CLASSES | RE_INTERVALS)
See section `Regular Expression Syntax' in Regex manual, for a detailed description of the regular expressions used.
Keyword.
PLAIN is used.
Basically, keywords and operators are lists of rules. The syntax is:
keywords are
  rules
end keywords
or
keywords in face-keyword are
  rules
end keywords
in which case the default face is set to face-keyword.
As an example:
keywords in Keyword_strong are
  /foo*/,
  "bar" "BAR" Keyword,
  -> \rightarrow
end keywords
is valid.
The syntax for the operators is the same, and both constructs can be
qualified with an optional flag, in which case they are taken
into account in the heavy highlighting mode (see section 3.1.7 Pretty Printing Options).
This is an extract of the C style sheet:
optional operators are
  -> \rightarrow,
  && \wedge,
  || \vee,
  != \neq,
  == \equiv,
  # We need to protect these, so that <= is not replaced in <<=
  <<=,
  >>=,
  <= \leq,
  >= \geq,
  ! \not
end operators
Note how `<<=' and `>>=' are protected (they are defined to be written as is when met in the source). This is to prevent the last two characters of `<<=' from being converted into a `less or equal' sign.
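The effect of protecting `<<=' can be reproduced with a toy rewriter. The Python below is a sketch under assumptions: `rewrite` is a hypothetical helper, a destination of `None` stands for "written as is", and trying the longest operators first is a choice made for the illustration, not a statement about how a2ps orders its matches.

```python
def rewrite(line, operators):
    """Replace operators in line.

    operators maps a source string to its destination, or to None when the
    source must be written as is.
    """
    ordered = sorted(operators, key=len, reverse=True)  # longest first
    out = []
    i = 0
    while i < len(line):
        for op in ordered:
            if line.startswith(op, i):
                destination = operators[op]
                out.append(op if destination is None else destination)
                i += len(op)
                break
        else:
            out.append(line[i])
            i += 1
    return "".join(out)


unprotected = {"<=": "\\leq"}
protected = {"<=": "\\leq", "<<=": None}
print(rewrite("a <<= 2", unprotected))  # the tail of <<= is rewritten
print(rewrite("a <<= 2", protected))    # <<= is left alone
```

With the protecting rule in place, `<<=` is consumed as a whole before `<=` gets a chance to match its last two characters.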
The order in which you define the elements of a category (except for
sequences) does not matter. But since a2ps sorts them at run time, it
may save time if the alphabetical C-order is more or less
followed.
You should be aware that when declaring a keyword with a regular expression as lhs, a2ps automatically makes this expression match only if there is no character of the first alphabet just before or just after the string.
In terms of implementation, it means that
keywords are /foo\\|bar/ end keywords
is exactly the same as
operators are /\\b\\(foo\\|bar\\)\\b/ end operators
This can cause problems if you use anchors (e.g. $, or ^)
in keywords: the matcher will be broken. In this particular case,
define your keywords as operators, taking care of the `\\b' by
yourself.
See section `Match-word-boundary Operator' in Regex manual, for details on `\b'.
Sequences admit several declarations too:
sequences ::= sequences are
sequence_1 `,' sequence_2...
end sequences
sequence ::= rule in_face close_opt exceptions_opt
| C-string
| C-char
;
close_opt ::= rule
| closers are
rules
end closers
| <nothing>
;
exceptions_opt ::= exceptions are
rules
end exceptions
| <nothing>
;
The rules are:
As a first example, the definition of C-string is:
sequences are
"\"" Plain String "\"" Plain
exceptions are
"\\\\", "\\\""
end exceptions
end sequences
The following example comes from `ssh.ssh', the style sheet for
style sheet files, in which there are two kinds of pseudo-strings: the
strings (`"example"'), and the regular expressions
(`/example/'). We do not want the content of the pseudo-strings in
the face String.
sequences are
# The comments
"#" Comment,
# The name of the style sheet
"style " Keyword_strong (Label + Index1) " is" Keyword_strong,
# Strings are exactly the C-strings, though we don't want to
# have them in the "string" face
"\"" Plain "\""
exceptions are
"\\\\", "\\\""
end exceptions,
# Regexps
"/" Plain "/"
exceptions are
"\\\\", "\\\/"
end exceptions
end sequences
The order between sequences does matter. For instance in Java, `/**' introduces strong comments, and `/*' comments. `/**' must be declared before `/*', or it will be hidden.
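The Java case can be mimicked with a first-match-wins lookup. This Python sketch assumes that the first declared opener that matches is the one retained, which is what the remark above suggests; `face_of` is a hypothetical helper.

```python
def face_of(text, openers):
    """Return the face of the first declared opener that starts text."""
    for opener, face in openers:
        if text.startswith(opener):
            return face
    return "Plain"


good_order = [("/**", "Comment_strong"), ("/*", "Comment")]
bad_order = [("/*", "Comment"), ("/**", "Comment_strong")]

print(face_of("/** A documentation comment */", good_order))
print(face_of("/** A documentation comment */", bad_order))
```

In the second declaration order `/*' always wins, so the strong comments are never seen.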
There are actually some sequences that could have been implemented as
operators with a specific regular expression (that goes up to the
closer). Nevertheless be aware of a big difference: regular expressions
are applied to a single line of the source file, hence they cannot
match across several lines. For instance, the C comments,
/*
 * a comment
 */
cannot be implemented with operators, though C++ comments can:
//
// a comment
//
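The single-line limitation is easy to observe with any line-oriented matcher. The Python sketch below applies one pattern per line, as an assumption about what "applied to a single line" means; the patterns and the helper `matches_per_line` are illustrative names only.

```python
import re

C_COMMENT = re.compile(r"/\*.*\*/")   # opener and closer on the same line
CPP_COMMENT = re.compile(r"//.*")     # opener up to the end of the line


def matches_per_line(pattern, source):
    """Apply pattern to each line separately and collect what it finds."""
    found = []
    for line in source.splitlines():
        m = pattern.search(line)
        if m:
            found.append(m.group(0))
    return found


c_source = "/*\n * a comment\n */"
cpp_source = "//\n// a comment\n//"
print(matches_per_line(C_COMMENT, c_source))
print(matches_per_line(CPP_COMMENT, cpp_source))
```

No line of the C comment contains both markers, so nothing is found, while each line of the C++ comment is matched on its own.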
Once your style sheet is written, you may want to let a2ps perform simple tests on it (e.g., checking there are no rules involving upper case characters in a case insensitive style sheet, etc.). These tests are performed when verbosity includes the style sheets.
You may also want to use the special convention that when a style sheet is requested with a suffix, a2ps will not look for it in its library path, but exactly where you are.
Suppose for instance you extended the `c.ssh' style sheet, which is in the current directory, and is declared case insensitive. Run
ubu $ a2ps foo.c -Ec.ssh -P void -v sheets
# Long output deleted
Checking coherence of "C" (c.ssh)
a2ps: c.ssh:`FILE' uses upper case characters
a2ps: c.ssh:`NULL' uses upper case characters
"C" (c.ssh) is corrupted.
---------- End of Finalization of c.ssh
Here, it is clear that C is not case insensitive.
In this section a simple example of style sheet is entirely covered: that of `ChangeLog' files.
`ChangeLog' files are some kind of memory of changes done to files, so that various programmers can understand what happened to the sources. This helps a lot, for instance, in guessing what recent changes may have introduced new bugs.
First of all, here is a sample of a `ChangeLog' file, taken from the `misc/' directory of the original a2ps package:
Sun Apr 27 14:29:22 1997 Akim Demaille <demaille@inf.enst.fr>
* base.ps: Merged in color.ps, since now a lot is
common [added box and underline features].
Fri Apr 25 14:05:20 1997 Akim Demaille <demaille@inf.enst.fr>
* color.ps: Added box and underline routines.
Mon Mar 17 20:39:11 1997 Akim Demaille <demaille@gargantua.enst.fr>
* base.ps: Got rid of CourierBack and reencoded_backspace_font.
Now the C has to handle this by itself.
Sat Mar 1 19:12:22 1997 Akim Demaille <demaille@gargantua.enst.fr>
* *.enc: they build their own dictionaries, to ease multi
lingual documents.
The syntax is really simple: a line specifying the author and the date of the changes, then a list of changes, all of them starting with a star followed by the names of the files concerned, then optionally, between parentheses, the functions affected, and then some comments.
Quite naturally the style will be called ChangeLog, hence:
style ChangeLog is

written by "Akim Demaille <demaille@inf.enst.fr>"
version is 1.0
requires a2ps 4.9.5

documentation is
  "This is a tutorial style sheet.\n"
end documentation
...
end style
A first interesting and easy entry is that of function names, between `(' and `)':
sequences are
"(" Plain Label ")" Plain
end sequences
A small problem that may occur is that several functions can be mentioned, separated by commas, which we don't want to highlight this way. Commas, here, are exceptions. Since regular expressions are not yet implemented in a2ps, there is a simple but stupid way to keep white space from being considered part of a function name, namely defining two exceptions: one which captures a single comma, and a second, capturing a comma and its trailing space.
For the file names, the problem is a bit more delicate, since they may end either with `:' or where the list of functions starts. We therefore define two sequences, each one with one of the possible closers, the exceptions being attached to the first one:
sequences are
"* " Plain Label_strong ":" Plain
exceptions are
", " Plain, "," Plain
end exceptions,
"* " Plain Label_strong " " Plain
end sequences
Finally, let us say that some words have a higher importance in the core of text: those about removing or adding something.
keywords in Keyword_strong are
  add, added, remove, removed
end keywords
Since they may appear in lower, upper, or mixed case, the style will be defined as case insensitive.
Finally, we end up with this style sheet file, in which an optional highlighting of the mail address of the author is done. Saving the file is the last step. But do not forget that a style sheet has both a name as nice as you may want (such as `Common Lisp'), and a key on which there are strict rules: the prefix must be alpha-numerical, lower case, with no more than 8 characters. Let's choose `chlog.ssh'.
# This is a tutorial on a2ps' style sheets
style ChangeLog is
written by "Akim Demaille <demaille@inf.enst.fr>"
version is 1.0
requires a2ps 4.9.5
documentation is
"Second level of high lighting covers emails."
end documentation
sequences are
"(" Plain Label ")" Plain
exceptions are
", " Plain, "," Plain
end exceptions,
"* " Plain Label_strong ":" Plain
exceptions are
", " Plain, "," Plain
end exceptions,
"* " Plain Label_strong " " Plain
end sequences
keywords in Keyword_strong are
add, added, remove, removed
end keywords
optional sequences are
< Plain Keyword > Plain
end sequences
end style
As a last step, you may wish to let a2ps check your style sheet, both for its syntax and for common errors:
ubu $ a2ps -vsheet -E/tmp/chlog.ssh ChangeLog -P void
Long output deleted
Checking coherence of "ChangeLog" (/tmp/chlog.ssh)
"ChangeLog" (/tmp/chlog.ssh) is sane.
---------- End of Finalization of /tmp/chlog.ssh
It's all set, your style sheet is ready!
The last touch is to include the pattern rules about `ChangeLog' files (which could appear as `ChangeLog.old' etc.) in `sheets.map':
# ChangeLog files
ChangeLog*      chlog
This won't work... Well, not always. Not for instance if you print `misc/ChangeLog'. This is not a bug, but truly a feature, since sometimes one gets more information about the type of a file from its path, than from the file name.
Here, to match the preceding path that may appear, just use `*':
# ChangeLog files
*ChangeLog*     chlog
If you want to be more specific (`FooChangeLog' should not match), use:
# ChangeLog files
ChangeLog*      chlog
*/ChangeLog*    chlog
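The three variants of the `sheets.map' entry can be compared with Python's `fnmatch` module. This is a sketch under a stated assumption: the patterns are taken to behave like shell globs in which `*' may also match `/', as the text above implies, and `fnmatchcase` merely stands in for whatever matcher a2ps really uses. The helper `sheet_for` is a hypothetical name.

```python
from fnmatch import fnmatchcase


def sheet_for(path, entries):
    """Return the key of the first entry whose pattern matches path."""
    for pattern, key in entries:
        if fnmatchcase(path, pattern):
            return key
    return None


naive = [("ChangeLog*", "chlog")]
loose = [("*ChangeLog*", "chlog")]
strict = [("ChangeLog*", "chlog"), ("*/ChangeLog*", "chlog")]

for name in ("ChangeLog.old", "misc/ChangeLog", "FooChangeLog"):
    print(name, sheet_for(name, naive), sheet_for(name, loose),
          sheet_for(name, strict))
```

Only the last pair of patterns accepts `misc/ChangeLog' while still rejecting `FooChangeLog'.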
The example we have presented until now uses only basic features, and does not take advantage of regexps. In this section we show how to write more advanced pretty-printing rules.
The target will be the lines like:
Sun Apr 27 14:29:22 1997 Akim Demaille <demaille@inf.enst.fr>
Fri Apr 25 14:05:20 1997 Akim Demaille <demaille@inf.enst.fr>
There are three fields: the date, the name, the mail. These lines all start at the beginning of line. The last field is the easiest to recognize: it starts with a `<', and finishes with a `>'. Its rule is then `/<[^>]+>/'. It is now easier to specify the second: it is composed only of words, at least one, separated by blanks, and is followed by the mail: `/[[:alpha:]]+\\([ \t]+[[:alpha:]]+\\)*/'. To concatenate the two, we introduce optional blanks, and we put each one into a pair of `\\('-`\\)' to make each one a recognizable part:
\\([[:alpha:]]+\\([ \t]+[[:alpha:]]+\\)*\\) \\(.+\\) \\(<[^>]+>\\)
Now the first part is rather easy: it starts at the beginning of the line, finishes with a digit. Once again, it is separated from the following field by blanks. Split by groups (see section `Grouping Operators' in Regex manual), we have:
^ \\([^\t ].*[0-9]\\) \\([ \t]+\\) \\([[:alpha:]]+\\([ \t]+[[:alpha:]]+\\)*\\) \\(.+\\) \\(<[^>]+>\\)
Now the destination is composed of back references to those groups, together with a face:
# We want to highlight the date and the maintainer name
optional operators are
  (/^\\([^\t ].*[0-9]\\)/                          # \1. The date
   /\\([ \t]+\\)/                                  # \2. Spaces
   /\\([[:alpha:]]+\\([ \t]+[[:alpha:]]+\\)*\\)/   # \3. Name
   /\\(.+\\)/                                      # \5. space and <
   /\\(<[^>]+\\)>/                                 # \6. email
   \1 Keyword, \2 Plain, \3 Keyword_strong,
   \5 Plain, \6 Keyword, > Plain)
end operators
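To check how the groups fall on a real header line, the rule can be transcribed into Python. This is a hedged sketch: `[A-Za-z]` replaces `[[:alpha:]]', which Python's `re` module does not support, groups are written `(...)` instead of `\\(...\\)', and `highlight_header` is a hypothetical helper.

```python
import re

HEADER = re.compile(
    r"^([^\t ].*[0-9])"                 # \1. The date
    r"([ \t]+)"                         # \2. Spaces
    r"([A-Za-z]+([ \t]+[A-Za-z]+)*)"    # \3. Name (\4 is the inner group)
    r"(.+)"                             # \5. Blanks before the mail
    r"(<[^>]+)>"                        # \6. Mail, without the closing >
)

# Group number and face, in the order of the destination list of the rule.
FACES = [(1, "Keyword"), (2, "Plain"), (3, "Keyword_strong"),
         (5, "Plain"), (6, "Keyword")]


def highlight_header(line):
    """Return (text, face) pairs for a header line, or None if not one."""
    m = HEADER.match(line)
    if m is None:
        return None
    parts = [(m.group(number), face) for number, face in FACES]
    parts.append((">", "Plain"))
    return parts


line = "Sun Apr 27 14:29:22 1997 Akim Demaille <demaille@inf.enst.fr>"
for text, face in highlight_header(line):
    print(repr(text), face)
```

Group 4 is skipped in the destination list because it only exists to repeat the words of the name; lines that start with a blank or a tabulation, such as the change entries, are not matched at all.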
Notice the way regexps are written, to ease reading.
This section is meant for people who wish to contribute style sheets. There are a couple of additional constraints, explained here.
Finally, make sure your style sheet behaves well! (see section 7.6.8 Checking a Style Sheet)