\input texinfo
|
|
@comment %**start of header
|
|
@setfilename recutils.info
|
|
@include version.texi
|
|
@settitle GNU Recutils
|
|
@afourpaper
|
|
@comment %**end of header
|
|
|
|
|
|
@comment Latin: videre licet,
|
|
@macro viz
|
|
@i{viz:@:}
|
|
@end macro
|
|
|
|
@comment Latin: id est
|
|
@macro ie
|
|
@i{i.e.@:}
|
|
@end macro
|
|
|
|
@comment Latin: exempli gratia
|
|
@macro eg
|
|
@i{e.g.@:}
|
|
@end macro
|
|
|
|
@comment Latin: et cetera
|
|
@macro etc
|
|
@i{etc.@:}
|
|
@end macro
|
|
|
|
|
|
@copying
|
|
This manual is for GNU recutils (version @value{VERSION},
|
|
@value{UPDATED}).
|
|
|
|
Copyright @copyright{} 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016,
|
|
2017, 2018, 2019, 2020, 2022 Jose E. Marchesi
|
|
|
|
Copyright @copyright{} 1994, 1995, 1996, 1997, 1998, 1999, 2000, 2001,
|
|
2002, 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013,
|
|
2014, 2020, 2022 Free Software Foundation, Inc.
|
|
|
|
@quotation
|
|
Permission is granted to copy, distribute and/or modify this document
|
|
under the terms of the GNU Free Documentation License, Version 1.3 or
|
|
any later version published by the Free Software Foundation; with no
|
|
Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A
|
|
copy of the license is included in the section entitled ``GNU Free
|
|
Documentation License''.
|
|
@end quotation
|
|
@end copying
|
|
|
|
@dircategory Database
|
|
@direntry
|
|
* recutils: (recutils). The GNU Recutils manual.
|
|
@end direntry
|
|
|
|
@dircategory Individual utilities
|
|
@direntry
|
|
* recinf: (recutils)Invoking recinf. Get info about recfiles.
|
|
* recsel: (recutils)Invoking recsel. Read records.
|
|
* recins: (recutils)Invoking recins. Insert records.
|
|
* recdel: (recutils)Invoking recdel. Delete records.
|
|
* recset: (recutils)Invoking recset. Manage fields.
|
|
* recfix: (recutils)Invoking recfix. Fix recfiles.
|
|
* csv2rec: (recutils)Invoking csv2rec. CSV to recfiles.
|
|
* rec2csv: (recutils)Invoking rec2csv. Recfiles to CSV.
|
|
* mdb2rec: (recutils)Invoking mdb2rec. MDB to recfiles.
|
|
@end direntry
|
|
|
|
@titlepage
|
|
@title GNU recutils
|
|
@subtitle for version @value{VERSION}, @value{UPDATED}
|
|
@author by Jose E. Marchesi and John Darrington
|
|
@page
|
|
@vskip 0pt plus 1filll
|
|
@insertcopying
|
|
@end titlepage
|
|
|
|
@contents
|
|
|
|
@ifnottex
|
|
@node Top
|
|
@top GNU Recutils
|
|
|
|
This manual documents version @value{VERSION} of the GNU recutils.
|
|
|
|
@insertcopying
|
|
@end ifnottex
|
|
|
|
@menu
|
|
The Basics
|
|
* Introduction:: Introducing recutils.
|
|
* The Rec Format:: Writing recfiles.
|
|
|
|
Using the Recutils
|
|
* Querying Recfiles:: Extracting data from recfiles.
|
|
* Editing Records:: Inserting and deleting records.
|
|
* Editing Fields:: Inserting, modifying and deleting fields.
|
|
|
|
Data Integrity
|
|
* Field Types:: Restrictions on the values of fields.
|
|
* Constraints on Record Sets:: Requiring or forbidding specific fields.
|
|
* Checking Recfiles:: Making sure the data is ok.
|
|
|
|
Advanced Topics
|
|
* Remote Descriptors:: Implementing distributed databases.
|
|
* Grouping and Aggregates:: Statistics.
|
|
* Queries which Join Records:: Crossing records of different types.
|
|
* Auto-Generated Fields:: Counters and time-stamps.
|
|
* Encryption:: Storing sensitive information.
|
|
* Generating Reports:: Formatted output with templates.
|
|
* Interoperability:: Importing and exporting to other formats.
|
|
* Bash Builtins:: Boosting the recutils in the shell.
|
|
|
|
Reference Material
|
|
* Invoking the Utilities:: Exhaustive list of command line arguments.
|
|
* Using ob-rec.el:: Invoking Recutils from Emacs Org-mode source blocks.
|
|
* Regular Expressions:: Flavor of regexps supported in recutils.
|
|
* Date input formats:: Specifying dates and times.
|
|
|
|
* GNU Free Documentation License:: Distribution terms for this document.
|
|
|
|
Indexes
|
|
* Concept Index::
|
|
|
|
@detailmenu
|
|
--- The Detailed Node Listing ---
|
|
---------------------------------
|
|
|
|
Here are some other nodes which are really subnodes of the ones
|
|
already listed, mentioned here so you can get to them in one step:
|
|
|
|
Introduction
|
|
|
|
* Purpose:: Why recutils.
|
|
* A Little Example:: Recutils in action.
|
|
|
|
The Rec Format
|
|
|
|
* Fields:: The key--value pairs which comprise the data.
|
|
* Records:: The main entities of a recfile.
|
|
* Comments:: Information for humans' benefit only.
|
|
* Record Descriptors:: Describing different types of records.
|
|
|
|
Querying Recfiles
|
|
|
|
* Simple Selections:: Introducing @command{recsel}.
|
|
* Selecting by Type:: Get the records of some given type.
|
|
* Selecting by Position:: Get the record occupying some position.
|
|
* Random Records:: Get a set of random records.
|
|
* Selection Expressions:: Get the records satisfying some expression.
|
|
* Field Expressions:: Selecting a subset of fields.
|
|
* Sorted Output:: Get the records in a given order.
|
|
|
|
Editing Records
|
|
|
|
* Inserting Records:: Inserting data into recfiles.
|
|
* Deleting Records:: Removing entries.
|
|
* Sorting Records:: Physical reordering of records.
|
|
|
|
Editing Fields
|
|
|
|
* Setting Fields:: Editing field values.
|
|
* Adding Fields:: Adding new fields to records.
|
|
* Deleting Fields:: Removing or commenting-out fields.
|
|
|
|
Field Types
|
|
|
|
* Declaring Types:: Declaration of types in record descriptors.
|
|
* Types and Fields:: Associating fields with types.
|
|
* Scalar Field Types:: Numbers and ranges.
|
|
* String Field Types:: Lines, limited strings and regular expressions.
|
|
* Enumerated Field Types:: Enumerations and boolean values.
|
|
* Date and Time Types:: Dates and times.
|
|
* Other Field Types:: Emails, fields, UUIDs, @dots{}
|
|
|
|
Constraints on Record Sets
|
|
|
|
* Mandatory Fields:: Requiring the presence of fields.
|
|
* Prohibited Fields:: Forbidding the presence of fields.
|
|
* Allowed Fields:: Restricting the presence of fields.
|
|
* Keys and Unique Fields:: Fields characterizing records.
|
|
* Size Constraints:: Limiting the size of a record set.
|
|
* Arbitrary Constraints:: Constraints records must comply with.
|
|
|
|
Checking Recfiles
|
|
|
|
* Syntactical Errors:: Fixing structure errors in recfiles.
|
|
* Semantic Errors:: Fixing semantic errors in recfiles.
|
|
|
|
Grouping and Aggregates
|
|
|
|
* Grouping Records:: Combining records by fields.
|
|
* Aggregate Functions:: Statistics and more.
|
|
|
|
Joins
|
|
|
|
* Foreign Keys:: Referring to records from other records.
|
|
* Joining Records:: Performing cross-joins.
|
|
|
|
Auto-Generated Fields
|
|
|
|
* Counters:: Generating incremental Ids.
|
|
* Unique Identifiers:: Generating universally unique Ids.
|
|
* Time-Stamps:: Tracking the creation of records.
|
|
|
|
Encryption
|
|
|
|
* Confidential Fields:: Declaring fields as sensitive data.
|
|
* Encrypting Files:: Encrypt confidential fields.
|
|
* Decrypting Data:: Reading encrypted fields.
|
|
|
|
Generating Reports
|
|
|
|
* Templates:: Formatted output.
|
|
|
|
Interoperability
|
|
|
|
* CSV Files:: Converting recfiles to/from csv files.
|
|
* Importing MDB Files:: Importing MS Access Databases.
|
|
|
|
Bash Builtins
|
|
|
|
* readrec:: Exporting the contents of records to the shell.
|
|
|
|
Invoking the Utilities
|
|
|
|
* Invoking recinf:: Printing information about rec files.
|
|
* Invoking recsel:: Selecting records.
|
|
* Invoking recins:: Inserting records.
|
|
* Invoking recdel:: Deleting records.
|
|
* Invoking recset:: Managing fields.
|
|
* Invoking recfix:: Fixing broken rec files, and diagnostics.
|
|
* Invoking recfmt:: Formatting records using templates.
|
|
* Invoking csv2rec:: Converting csv data into rec data.
|
|
* Invoking rec2csv:: Converting rec data into csv data.
|
|
* Invoking mdb2rec:: Converting mdb files into rec files.
|
|
|
|
@end detailmenu
|
|
@end menu
|
|
|
|
@node Introduction
|
|
@chapter Introduction
|
|
|
|
@menu
|
|
* Purpose:: Why recutils.
|
|
* A Little Example:: Recutils in action.
|
|
@end menu
|
|
|
|
@node Purpose
|
|
@section Purpose
|
|
|
|
GNU recutils is a set of tools and libraries to access human-editable,
|
|
text-based databases called @emph{recfiles}. The data is stored as a
|
|
sequence of records, each record containing an arbitrary number of
|
|
named fields. Advanced capabilities usually found in other data
|
|
storage systems are supported: data types, data integrity (keys,
|
|
mandatory fields, @etc{}) as well as the ability of records to refer to
|
|
other records (sort of foreign keys). Despite their simplicity,
|
|
recfiles can be used to store medium-sized databases.
|
|
|
|
So, yet another data storage system? The mere existence of this
|
|
package deserves an explanation. There is a rich set of already
|
|
available free data storage systems, covering a broad range of
|
|
requirements. Big systems having complex data storage requirements
|
|
will probably make use of some full-fledged relational system such as
|
|
MySQL or PostgreSQL@. Less demanding applications, or applications
|
|
with special deployment requirements, may find it more convenient to
|
|
use a simpler system such as SQLite, where the data is stored in a
|
|
single binary file. XML files are often used to store configuration
|
|
settings for programs, and to encode data for transmission through
|
|
networks.
|
|
|
|
So it looks like all the needs are covered by the existing
|
|
solutions @dots{} but consider the following characteristics of the
|
|
data storage systems mentioned in the previous paragraph:
|
|
|
|
@itemize @minus
|
|
@item The stored data is not directly human readable.
|
|
@item The stored data is definitely not directly writable by humans.
|
|
@item They are program dependent.
|
|
@item They are not easily managed by version control systems.
|
|
@end itemize
|
|
|
|
@cindex readability
|
|
Regarding the first point (human readability), while it is clearly
|
|
true for the binary files, some may argue XML files are indeed human
|
|
readable@dots{} well@dots{} @code{<bar><foo tag="val">try</foo> to r&iamp;ead
|
|
<p>this</p></bar>}. YAML@footnote{``YAML Ain't Markup Language''; originally ``Yet Another Markup Language''.} is an
|
|
example of a hierarchical data storage format which is much more
|
|
readable than XML@. The problem with YAML is that it was designed as a
|
|
``data serialization language'' and thus to map the data constructs
|
|
usually found in programming languages. That makes it too complex for
|
|
the simple task of storing plain lists of items.
|
|
|
|
Recfiles are human-readable, human-writable and still easy to
|
|
parse and to manipulate automatically. Obviously they are not
|
|
suitable for every task (for example, it can be difficult to manage
|
|
hierarchies in recfiles) and performance is somewhat sacrificed in
|
|
favor of readability. But they are quite handy to store small to
|
|
medium simple databases.
|
|
|
|
The GNU recutils suite comprises:
|
|
|
|
@itemize @minus
|
|
@item This Texinfo manual, describing the Rec format and the accompanying software.
|
|
@item A C library (librec) that provides a rich set of functions to manipulate rec data.
|
|
@item A set of utilities that can be used in shell scripts and in the command line to operate on rec files.
|
|
@item An Emacs mode, @code{rec-mode}.
|
|
@end itemize
|
|
|
|
@node A Little Example
|
|
@section A Little Example
|
|
|
|
@cindex books
|
|
Everyone loves to grow a nice book collection at home. Unfortunately,
|
|
in most cases the management of our private books gets uncontrolled:
|
|
some books get lost, some of them may be loaned to some friend, there
|
|
are some duplicated (or even triplicated!) titles because we forgot
|
|
about the existence of the previous copy, and many more details.
|
|
|
|
In order to improve the management of our little book collection we
|
|
could make use of a complex data storage system such as a relational
|
|
database. The problem with that approach, as explained in the
|
|
previous section, is that the tool is too complicated for the simple
|
|
task: we do not need the full power of a relational database system to
|
|
maintain a simple collection of books.
|
|
|
|
With GNU recutils it is possible to maintain such a little database in
|
|
a text file. Let's call it @file{books.rec}. The following list
|
|
summarizes the information items that we want to store for each title,
|
|
along with some common-sense restrictions.
|
|
|
|
@itemize @minus
|
|
@item
|
|
Every book has a title, even if it is ``No Title''.
|
|
@item
|
|
A book can have several titles.
|
|
@item
|
|
A book can have more than one author.
|
|
@item
|
|
For some books the author is not known.
|
|
@item
|
|
Sometimes we don't care about who the author of a book is.
|
|
@item
|
|
We usually store our books at home.
|
|
@item
|
|
Sometimes we loan books to friends.
|
|
@item
|
|
On occasions we lose track of the physical location of a book. Did
|
|
we loan it to anyone? Was it lost in the last move? Is it in some
|
|
hidden place at home?
|
|
@end itemize
|
|
|
|
@noindent
|
|
The contents of the rec file follow:
|
|
|
|
@example
|
|
# -*- mode: rec -*-
|
|
|
|
%rec: Book
|
|
%mandatory: Title
|
|
%type: Location enum loaned home unknown
|
|
%doc:
|
|
+ A book in my personal collection.
|
|
|
|
Title: GNU Emacs Manual
|
|
Author: Richard M. Stallman
|
|
Publisher: FSF
|
|
Location: home
|
|
|
|
Title: The Colour of Magic
|
|
Author: Terry Pratchett
|
|
Location: loaned
|
|
|
|
Title: Mio Cid
|
|
Author: Anonymous
|
|
Location: home
|
|
|
|
Title: chapters.gnu.org administration guide
|
|
Author: Nacho Gonzalez
|
|
Author: Jose E. Marchesi
|
|
Location: unknown
|
|
|
|
Title: Yeelong User Manual
|
|
Location: home
|
|
|
|
# End of books.rec
|
|
@end example
|
|
|
|
Simple. The file contains a set of records separated by blank lines.
|
|
Each record comprises a set of fields with a name and a value.
|
|
|
|
The GNU recutils can then be used to access the contents of the file.
|
|
For example, we could get a list of the names of loaned books by invoking
|
|
@command{recsel} in the following way:
|
|
|
|
@example
|
|
$ recsel -e "Location = 'loaned'" -P Title books.rec
|
|
The Colour of Magic
|
|
@end example
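To make the meaning of that query concrete, here is a minimal Python
sketch of what the invocation computes. It is only an illustration and
not how @command{recsel} is implemented: the helper
@code{select_values} and the in-memory representation of records are
assumptions of this sketch, and it handles a single equality test
rather than the full selection expression language.

```python
# Each record is a list of (field name, field value) pairs.
BOOKS = [
    [("Title", "GNU Emacs Manual"), ("Author", "Richard M. Stallman"),
     ("Publisher", "FSF"), ("Location", "home")],
    [("Title", "The Colour of Magic"), ("Author", "Terry Pratchett"),
     ("Location", "loaned")],
    [("Title", "Mio Cid"), ("Author", "Anonymous"), ("Location", "home")],
    [("Title", "chapters.gnu.org administration guide"),
     ("Author", "Nacho Gonzalez"), ("Author", "Jose E. Marchesi"),
     ("Location", "unknown")],
    [("Title", "Yeelong User Manual"), ("Location", "home")],
]


def select_values(records, where_name, where_value, wanted):
    """Return the values of the WANTED fields of every record having
    a field WHERE_NAME whose value equals WHERE_VALUE.

    Hypothetical helper: it mimics only the simple equality test used
    in the example above.
    """
    found = []
    for record in records:
        if (where_name, where_value) in record:
            found.extend(value for name, value in record if name == wanted)
    return found


# Same question as the recsel invocation: titles of the loaned books.
print("\n".join(select_values(BOOKS, "Location", "loaned", "Title")))
```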
|
|
|
|
@node The Rec Format
|
|
@chapter The Rec Format
|
|
|
|
A recfile is nothing but a text file which conforms to a few simple
|
|
rules. This chapter shows you how, by observing these rules, recfiles
|
|
of arbitrary complexity can be written.
|
|
|
|
@menu
|
|
* Fields:: The key--value pairs which comprise the data.
|
|
* Records:: The main entities of a recfile.
|
|
* Comments:: Information for humans' benefit only.
|
|
* Record Descriptors:: Describing different types of records.
|
|
@end menu
|
|
|
|
@node Fields
|
|
@section Fields
|
|
|
|
@cindex field
|
|
A @dfn{field} is the written form of an association between a label
|
|
and a value. For example, if we wanted to associate the label
|
|
@code{Name} with the value @code{Ada Lovelace} we would write:
|
|
|
|
@example
|
|
Name: Ada Lovelace
|
|
@end example
|
|
|
|
The separator between the field name and the field value is a colon
|
|
followed by a blank character (a space or a tab, but not a newline). The
|
|
name of the field shall begin in the first column of the line.
|
|
|
|
@cindex field name
|
|
A @dfn{field name} is a sequence of alphanumeric characters plus
|
|
underscores (@code{_}), starting with a letter or the character
|
|
@code{%}. The regular expression denoting a field name is:
|
|
|
|
@example
|
|
[a-zA-Z%][a-zA-Z0-9_]*
|
|
@end example
|
|
|
|
@cindex case, in field names
|
|
Field names are case-sensitive. @code{Foo} and @code{foo} are
|
|
different field names.
|
|
|
|
The following list contains valid field names (the final colon is not
|
|
part of the names):
|
|
|
|
@example
|
|
Foo:
|
|
foo:
|
|
A23:
|
|
ab1:
|
|
A_Field:
|
|
@end example
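The rule for field names can be checked mechanically. The following
Python sketch applies the regular expression given above to a few
candidate names. The helper name @code{is_field_name} is an assumption
of this sketch and is not part of recutils.

```python
import re

# Pattern copied from the regular expression given in this section.
FIELD_NAME = re.compile(r"[a-zA-Z%][a-zA-Z0-9_]*")


def is_field_name(name):
    """Return True if NAME, taken as a whole, is a valid field name."""
    return FIELD_NAME.fullmatch(name) is not None


# The first five names are the valid ones listed above; the last two
# break the rule (leading digit, and a hyphen inside the name).
for candidate in ["Foo", "foo", "A23", "ab1", "A_Field", "1abc", "my-field"]:
    print(candidate, is_field_name(candidate))
```

Note that @code{Foo} and @code{foo} are both accepted but remain
distinct names, since the pattern is case-sensitive.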
|
|
|
|
@cindex field values
|
|
The @dfn{value of a field} is a sequence of characters terminated by a
|
|
single newline character (@code{\n}).
|
|
|
|
@cindex multiline field values
|
|
Sometimes a value is too long to fit in the usual width of terminals
|
|
and screens. In that case, depending on the specific tool used to
|
|
access the file, the readability of the data would not be that good.
|
|
It is therefore possible to physically split a logical line by
|
|
escaping a newline with a backslash character, as in:
|
|
|
|
@example
|
|
LongLine: This is a quite long value \
|
|
comprising a single unique logical line \
|
|
split in several physical lines.
|
|
@end example
|
|
|
|
The sequence @code{\n} (newline) @code{+} (PLUS) and an optional
|
|
@code{_} (SPACE) is interpreted as a newline when found in a field
|
|
value. For example, the C string @code{"bar1\nbar2\n bar3"} would be
|
|
encoded in the following way in a field value:
|
|
|
|
@example
|
|
Foo: bar1
|
|
+ bar2
|
|
+  bar3
|
|
@end example
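The two encodings described in this section can be undone in a few
lines of code. The following Python sketch is an illustration under
stated assumptions, not the parser used by librec: the helper name
@code{decode_field} is hypothetical, and the sketch assumes that
exactly one blank after the colon belongs to the separator.

```python
import re


def decode_field(text):
    """Decode the raw text of one field into a (name, value) pair.

    Hypothetical helper: TEXT is the field as written in the recfile,
    without the final newline that terminates it.
    """
    # A backslash followed by a newline joins two physical lines
    # into a single logical line.
    text = text.replace("\\\n", "")
    name, _, value = text.partition(":")
    # Assumption of this sketch: one blank after the colon is part of
    # the separator, not of the value.
    if value[:1] in (" ", "\t"):
        value = value[1:]
    # A newline, a plus sign and one optional space encode a newline.
    value = re.sub(r"\n\+ ?", "\n", value)
    return name, value


# The three-line field equivalent to the C string "bar1\nbar2\n bar3".
print(decode_field("Foo: bar1\n+ bar2\n+  bar3"))
```

Only one space after the plus sign is consumed, which is why a value
line that starts with a space needs two spaces after the @code{+}.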
|
|
|
|
@node Records
|
|
@section Records
|
|
|
|
@cindex record
|
|
A @dfn{record} is a group of one or more fields written one after the
|
|
other:
|
|
|
|
@example
|
|
Name1: Value1
|
|
Name2: Value2
|
|
Name2: Value3
|
|
@end example
|
|
|
|
It is possible for several fields in a record to share the same name
|
|
and/or the same value. The following is a valid record containing
|
|
three fields:
|
|
|
|
@example
|
|
Name: John Smith
|
|
Email: john.smith@@foomail.com
|
|
Email: john@@smith.name
|
|
@end example
|
|
|
|
@cindex record size
|
|
@cindex size, record size
|
|
The @dfn{size of a record} is defined as the number of fields that it
|
|
contains. A record cannot be empty, so the minimum size
|
|
for a record is 1. The maximum number of fields for a record is only
|
|
limited by the available physical resources. The size of the previous
|
|
record is 3.
|
|
|
|
Records are separated by one or more blank lines. For instance, the
|
|
following example shows a file named @file{personalities.rec}
|
|
featuring three records:
|
|
|
|
@example
|
|
Name: Ada Lovelace
|
|
Age: 36
|
|
|
|
Name: Peter the Great
|
|
Age: 53
|
|
|
|
Name: Matusalem
|
|
Age: 969
|
|
@end example
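As an illustration of how blank lines delimit records, the following
Python sketch splits the text of @file{personalities.rec} into records
and reports the size of each one. The function name
@code{split_records} is an assumption of this sketch, which
deliberately ignores comments and multi-line values.

```python
def split_records(text):
    """Split recfile TEXT into records.

    Each record is returned as a list of (name, value) pairs.
    Simplified sketch: comments and multi-line values are not handled.
    """
    records = []
    current = []
    for line in text.splitlines():
        if line.strip() == "":
            # One or more blank lines end the current record.
            if current:
                records.append(current)
                current = []
        else:
            name, _, value = line.partition(": ")
            current.append((name, value))
    if current:
        records.append(current)
    return records


SAMPLE = """\
Name: Ada Lovelace
Age: 36

Name: Peter the Great
Age: 53


Name: Matusalem
Age: 969
"""

people = split_records(SAMPLE)
print(len(people))                    # number of records
print([len(r) for r in people])       # size of each record
```

The two consecutive blank lines before the last record in
@code{SAMPLE} show that any number of blank lines acts as a single
separator.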
|
|
|
|
@node Comments
|
|
@section Comments
|
|
|
|
@cindex comments
|
|
Any line having a @code{#} (ASCII 0x23) character in the first column
|
|
is a comment line.
|
|
|
|
Comments may be used to insert information that
|
|
is not part of the database but useful in other ways.
|
|
They are completely ignored by processing tools and can only be seen by
|
|
looking at the recfile itself.
|
|
|
|
It is also quite convenient to comment-out information from the
|
|
recfile without having to remove it in a definitive way: you may want
|
|
to recover the data into the database later! Comment lines can be
|
|
used to comment-out both full records and single fields:
|
|
|
|
@example
|
|
Name: Jose E. Marchesi
|
|
# Occupation: Software Engineer
|
|
# Severe lack of brain capacity
|
|
# Fired on 02/01/2009 (without compensation)
|
|
Occupation: Unoccupied
|
|
@end example
|
|
|
|
Comments are also useful for headers, footers, comment blocks and all
|
|
kinds of markers:
|
|
|
|
@example
|
|
# -*- mode: rec -*-
|
|
#
|
|
# TODO
|
|
#
|
|
# This file contains the Bugs database of GNU recutils.
|
|
#
|
|
# Blah blah@dots{}
|
|
|
|
@dots{}
|
|
|
|
# End of TODO
|
|
@end example
|
|
|
|
|
|
Unlike some file formats, comments in recfiles must be complete lines.
|
|
You cannot start a comment in the middle of a line.
|
|
For example, in the following record, the @code{#} does @emph{not} start a comment:
|
|
@example
|
|
Name: Peter the Great # Russian Tsar
|
|
Age: 53
|
|
@end example
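The first-column rule is simple enough to state in code. In the
following Python sketch the helper @code{is_comment} is an assumption
made for illustration. It shows that the @code{#} in the record above
is kept as part of the field value.

```python
def is_comment(line):
    """A line is a comment only if '#' is in its first column."""
    return line.startswith("#")


line = "Name: Peter the Great # Russian Tsar"
name, _, value = line.partition(": ")

print(is_comment("# Occupation: Software Engineer"))
print(is_comment(line))
print(value)  # the text after the '#' is still part of the value
```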
|
|
|
|
@node Record Descriptors
|
|
@section Record Descriptors
|
|
|
|
@cindex descriptor
|
|
Certain properties of a set of records can be specified by preceding
|
|
them with a @dfn{record descriptor}. A record descriptor is itself a
|
|
record, and uses fields with some predefined names to store
|
|
properties.
|
|
|
|
@menu
|
|
* Record Sets:: Defining different types of records.
|
|
* Naming Record Types:: Some conventions on naming record sets.
|
|
* Documenting Records:: Documenting your record sets.
|
|
* Record Sets Properties:: Introducing the special fields.
|
|
@end menu
|
|
|
|
@node Record Sets
|
|
@subsection Record Sets
|
|
@cindex record sets
|
|
|
|
The most basic property that can be specified for a set of records is
|
|
their @dfn{type}. The special field name @code{%rec} is used for that
|
|
purpose:
|
|
|
|
@cindex @code{%rec}
|
|
@example
|
|
%rec: Entry
|
|
|
|
Id: 1
|
|
Name: Entry 1
|
|
|
|
Id: 2
|
|
Name: Entry 2
|
|
@end example
|
|
|
|
The records following the descriptor are then identified as having
|
|
its type. So in the example above we would say there are two records
|
|
of type ``Entry''. Or in a more colloquial way we would say there are
|
|
two ``Entries'' in the database.
|
|
|
|
The effect of a record descriptor ends when another descriptor is
|
|
found in the stream of records. This allows you to store different kinds
|
|
of records in the same database. For example, suppose you are
|
|
maintaining a depot. You will need to keep track of both what items
|
|
are available and when they are sold or restocked.
|
|
|
|
The following example shows the usage of two record descriptors to
|
|
store both kinds of records: articles and stock.
|
|
|
|
@example
|
|
%rec: Article
|
|
|
|
Id: 1
|
|
Title: Article 1
|
|
|
|
Id: 2
|
|
Title: Article 2
|
|
|
|
%rec: Stock
|
|
|
|
Id: 1
|
|
Type: sell
|
|
Date: 20 April 2011
|
|
|
|
Id: 2
|
|
Type: stock
|
|
Date: 21 April 2011
|
|
@end example
|
|
|
|
A collection of records of the same type in a recfile is known as a
@dfn{record set} in recutils jargon. In the example above two
|
|
record sets are defined: one containing articles and the other
|
|
containing stock movements.
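
The way a descriptor applies to the records that follow it, until the
next descriptor is found, can be sketched in Python as follows. This is
an illustration only: the function @code{record_sets} and the
representation of records as lists of pairs are assumptions of the
sketch, not the librec data model.

```python
def record_sets(records):
    """Group RECORDS (lists of (name, value) pairs) into record sets.

    Returns a list of (type, records) pairs in file order.  The type
    is None for records that precede any descriptor.
    """
    sets = []
    current_type, current = None, []
    for record in records:
        rec_types = [value for name, value in record if name == "%rec"]
        if rec_types:
            # A new descriptor ends the effect of the previous one.
            if current or current_type is not None:
                sets.append((current_type, current))
            current_type, current = rec_types[0], []
        else:
            current.append(record)
    if current or current_type is not None:
        sets.append((current_type, current))
    return sets


# The depot example, already split into records.
depot = [
    [("%rec", "Article")],
    [("Id", "1"), ("Title", "Article 1")],
    [("Id", "2"), ("Title", "Article 2")],
    [("%rec", "Stock")],
    [("Id", "1"), ("Type", "sell"), ("Date", "20 April 2011")],
    [("Id", "2"), ("Type", "stock"), ("Date", "21 April 2011")],
]

for rec_type, members in record_sets(depot):
    print(rec_type, len(members))
```

A descriptor with no records after it still yields a record set, which
is then empty.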
|
|
|
|
Nothing prevents having empty record sets in databases. This is in fact
|
|
usually the case when a new recfile is written but no data exists yet.
|
|
In our depot example we could write a first version of the database
|
|
containing just the record descriptors:
|
|
|
|
@example
|
|
%rec: Article
|
|
|
|
%rec: Stock
|
|
@end example
|
|
|
|
@cindex default record types
|
|
Record descriptors are not required, and many recfiles do not have them.
|
|
This is because
|
|
all the records contained in the file are of the same type, and their
|
|
nature can usually be inferred from both the file name and their
|
|
contents. For example, @file{contacts.rec} could simply contain
|
|
records representing contacts without an explicit @code{%rec: Contact}
|
|
record descriptor. In this case we say that the type of the anonymous
|
|
records stored in the file is the @dfn{default record type}.
|
|
|
|
Another possible situation, although not usual, is to have a recfile
|
|
containing both non-typed (default) and typed records:
|
|
|
|
@example
|
|
Id: 1
|
|
Title: Blah
|
|
|
|
Id: 2
|
|
Title: Bleh
|
|
|
|
%rec: Movement
|
|
|
|
Date: 13-Aug-2012
|
|
Concept: 20
|
|
|
|
Date: 24-Sept-2012
|
|
Concept: 12
|
|
@end example
|
|
|
|
@noindent
|
|
In this case the records preceding the movements are of the
|
|
``default'' type, whereas the records following the record descriptor
|
|
are of type @code{Movement}. Even though it is supported by the format
|
|
and the utilities, it is generally not recommended to mix non-typed
|
|
and typed records in a recfile.
|
|
|
|
@node Naming Record Types
|
|
@subsection Naming Record Types
|
|
|
|
It is up to you how to name your record sets. Any string comprising
|
|
only alphanumeric characters or underscores, and that starts with a
|
|
letter will be a legal name. However, it is recommended to use the
|
|
singular form of a noun in order to describe the ``type'' of the
|
|
records in the record set. Examples are @code{Article},
|
|
@code{Contributor}, @code{Employee} and @code{Movement}.
|
|
|
|
The noun used should be specific enough to characterize the
|
|
property of the records which matters. For example, in a
|
|
contributor's database it would be better to have a record set named
|
|
@code{Contributor} than @code{Person}.
|
|
|
|
The reason for using singular nouns instead of their plural forms is
|
|
that it works better with the utilities: it is more natural to read
|
|
@command{recsel -t Contributor} (@option{-t} is for ``type'') than
|
|
@command{recsel -t Contributors}.
|
|
|
|
@node Documenting Records
|
|
@subsection Documenting Records
|
|
|
|
@cindex @code{%doc}
|
|
@cindex documentation fields
|
|
@cindex description of record sets
|
|
|
|
As well as a name, it is a good idea to provide a description of the record set.
|
|
This is sometimes called the record set's @dfn{documentation} and is specified
|
|
using the @code{%doc} field.
|
|
|
|
Whereas the name is usually short and can contain only alphanumeric
|
|
characters and underscores, no such restriction applies to the
|
|
documentation. The documentation is typically more verbose than the
|
|
name provided by the @code{%rec} field and may contain arbitrary
|
|
characters such as punctuation and parentheses. It is somewhat
|
|
similar to a comment (@pxref{Comments}), but it can be managed more easily
|
|
in a programmatic way. Unlike a comment, the @code{%doc} field is
|
|
recognized by tools such as @command{recinf} (@pxref{Invoking recinf})
|
|
which process record descriptors. For example, you might have two
|
|
record sets with @code{%rec} and @code{%doc} fields as follows:
|
|
|
|
@example
|
|
%rec: Contact
|
|
%doc: Family, friends and acquaintances (other than business).
|
|
|
|
Name: Granny
|
|
Phone: +12 23456677
|
|
|
|
Name: Edwina
|
|
Phone: +55 0923 8765
|
|
|
|
|
|
%rec: Associate
|
|
%doc: Colleagues and other business contacts
|
|
|
|
Name: Karl Schmidt
|
|
Phone: +49 88234566
|
|
|
|
Name: Genevieve Curie
|
|
Phone: +33 34 87 65
|
|
@end example
|
|
|
|
@node Record Sets Properties
|
|
@subsection Record Sets Properties
|
|
|
|
@cindex field, special fields
|
|
@cindex special fields
|
|
Besides determining the type of record that follows in the
|
|
stream, record descriptors can be used to describe other properties of
|
|
those records. This can be done by using @dfn{special
|
|
fields}, which have special names from a predefined set.
|
|
Consider for example the following database, where record descriptors
|
|
are used to specify an (optional) numeric `Id' and a mandatory `Title' field:
|
|
|
|
@cindex @code{%mandatory}
|
|
@cindex mandatory fields
|
|
@example
|
|
%rec: Item
|
|
%type: Id int
|
|
%mandatory: Title
|
|
|
|
Id: 10
|
|
Title: Notebook (big)
|
|
|
|
Id: 11
|
|
Title: Fountain Pen
|
|
@end example
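To give an idea of what these two properties require of each record,
the following Python sketch checks a record against a list of mandatory
fields and a list of integer-typed fields. It is a rough illustration
and not the checker shipped with recutils: the helper
@code{check_record} and its messages are hypothetical, and only plain
decimal integers are recognized here.

```python
import re


def check_record(record, mandatory, int_fields):
    """Return a list of problems found in RECORD.

    RECORD is a list of (name, value) pairs; MANDATORY and INT_FIELDS
    are lists of field names.  Hypothetical helper for illustration.
    """
    problems = []
    names = [name for name, _ in record]
    for field in mandatory:
        if field not in names:
            problems.append("missing mandatory field " + field)
    for name, value in record:
        # Assumption of this sketch: an integer is an optional minus
        # sign followed by decimal digits.
        if name in int_fields and not re.fullmatch(r"-?[0-9]+", value.strip()):
            problems.append("field " + name + " is not an integer: " + value)
    return problems


good = [("Id", "10"), ("Title", "Notebook (big)")]
bad = [("Id", "ten")]

print(check_record(good, ["Title"], ["Id"]))
print(check_record(bad, ["Title"], ["Id"]))
```

A record without an @code{Id} field passes the check, since the type
declaration restricts the values of the field but does not require its
presence.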
|
|
|
|
Note that the names of special fields always start with the character
|
|
@code{%}. Also note that it is possible to use non-special
|
|
fields in a record descriptor, but such fields will have no effect on
|
|
the described record set.
|
|
|
|
Every record descriptor must contain one, and only one, field named
@code{%rec}. That field is not required to occupy the first
position in the descriptor. However, it is considered good style to
place it first, so that the casual reader can easily identify the
type of the records.
|
|
|
|
The following list briefly describes the special fields defined in the
|
|
recutils format, along with references to the sections of this manual
|
|
describing their usage in depth.
|
|
|
|
@cindex special fields, list of
|
|
@table @code
|
|
@item %rec
|
|
Naming record types. It also allows using external and remote
|
|
descriptors. @xref{Remote Descriptors}.
|
|
@item %mandatory, %allowed and %prohibit
|
|
Requiring or forbidding specific fields. @xref{Mandatory Fields}.
|
|
@xref{Prohibited Fields}. @xref{Allowed Fields}.
|
|
@item %unique and %key
|
|
Working with keys. @xref{Keys and Unique Fields}.
|
|
@item %doc
|
|
Documenting your database. @xref{Documenting Records}.
|
|
@item %typedef and %type
|
|
Field types. @xref{Field Types}.
|
|
@item %auto
|
|
Auto-counters and time-stamps. @xref{Auto-Generated Fields}.
|
|
@item %sort
|
|
Keeping your record sets sorted. @xref{Sorted Output}.
|
|
@item %size
|
|
Restricting the size of your database. @xref{Size Constraints}.
|
|
@item %constraint
|
|
Enforcing arbitrary constraints. @xref{Arbitrary Constraints}.
|
|
@item %confidential
|
|
Storing confidential information. @xref{Encryption}.
|
|
@item %singular
|
|
Fields without repeating values.
|
|
@end table
|
|
|
|
@node Querying Recfiles
|
|
@chapter Querying Recfiles
|
|
|
|
Since recfiles are always human readable, you could look up data simply
|
|
by opening an editor and searching for the desired information. Or
|
|
you could use a standard tool such as @command{grep} to extract
|
|
strings matching a pattern. However, recutils provides a more powerful
|
|
and flexible way to look up data. The following sections explore how
|
|
the recutils can be used in order to extract data from recfiles, from
|
|
very basic and simple queries to quite complex examples.
|
|
|
|
@menu
|
|
* Simple Selections:: Introducing @command{recsel}.
|
|
* Selecting by Type:: Get the records of some given type.
|
|
* Selecting by Position:: Get the record occupying some position.
|
|
* Random Records:: Get a set of random records.
|
|
* Selection Expressions:: Get the records satisfying some expression.
|
|
* Field Expressions:: Selecting a subset of fields.
|
|
* Sorted Output:: Get the records in a given order.
|
|
@end menu
|
|
|
|
@node Simple Selections
|
|
@section Simple Selections
|
|
|
|
@command{recsel} is a utility whose primary purpose is to select
|
|
records from a recfile and print them on standard output.
|
|
Consider the following example record set, which we shall assume is
|
|
saved in a recfile called @file{acquaintances.rec}:
|
|
|
|
@example
|
|
# This database contains a list of both real and fictional people
|
|
# along with their age.
|
|
|
|
Name: Ada Lovelace
|
|
Age: 36
|
|
|
|
Name: Peter the Great
|
|
Age: 53
|
|
|
|
# Name: Matusalem
|
|
# Age: 969
|
|
|
|
Name: Bart Simpson
|
|
Age: 10
|
|
|
|
Name: Adrian Mole
|
|
Age: 13.75
|
|
@end example
|
|
|
|
@noindent
|
|
If we invoke @command{recsel acquaintances.rec} we will get a list of
|
|
all the records stored in the file in the terminal:
|
|
|
|
@example
|
|
$ recsel acquaintances.rec
|
|
Name: Ada Lovelace
|
|
Age: 36
|
|
|
|
Name: Peter the Great
|
|
Age: 53
|
|
|
|
Name: Bart Simpson
|
|
Age: 10
|
|
|
|
Name: Adrian Mole
|
|
Age: 13.75
|
|
@end example
|
|
|
|
@noindent
|
|
Note that the commented out parts of the file, in this case the
|
|
explanatory header and the record corresponding to Matusalem, are not
|
|
part of the output produced by @command{recsel}. This is because
|
|
@command{recsel} is concerned only with the data.
|
|
|
|
@command{recsel} will also ``pack'' the records so any extra empty
|
|
lines that may be between records are not echoed in the output:
|
|
|
|
@multitable @columnfractions .5 .5
|
|
@item
|
|
@example
|
|
@strong{acquaintances.rec:}
|
|
|
|
Name: Peter the Great
|
|
Age: 53
|
|
|
|
# Note the extra empty lines.
|
|
|
|
|
|
Name: Bart Simpson
|
|
Age: 10
|
|
@end example
|
|
@tab
|
|
@example
|
|
$ recsel acquaintances.rec
|
|
Name: Peter the Great
|
|
Age: 53
|
|
|
|
Name: Bart Simpson
|
|
Age: 10
|
|
@end example
|
|
@end multitable
|
|
|
|
@noindent
|
|
It is common for related data to be spread over several recfiles.
|
|
For example, we could have a @file{contacts.rec} file containing
|
|
general contact records, and also a @file{work-contacts.rec} file
|
|
containing business contacts:
|
|
|
|
@multitable @columnfractions .5 .5
|
|
@item
|
|
@example
|
|
@strong{contacts.rec:}
|
|
|
|
Name: Granny
|
|
Phone: +12 23456677
|
|
|
|
Name: Doctor
|
|
Phone: +12 58999222
|
|
@end example
|
|
@tab
|
|
@example
|
|
@strong{work-contacts.rec:}
|
|
|
|
Name: Yoyodyne Corp.
|
|
Email: sales@@yoyod.com
|
|
Phone: +98 43434433
|
|
|
|
Name: Robert Harris
|
|
Email: robert.harris@@yoyod.com
|
|
Note: Sales Department.
|
|
@end example
|
|
@end multitable
|
|
|
|
Both files can be passed to @command{recsel} in the command line. In
|
|
that case @command{recsel} will simply process them and output their
|
|
records in the same order they were specified:
|
|
|
|
@example
|
|
$ recsel contacts.rec work-contacts.rec
|
|
Name: Granny
|
|
Phone: +12 23456677
|
|
|
|
Name: Doctor
|
|
Phone: +12 58999222
|
|
|
|
Name: Yoyodyne Corp.
|
|
Email: sales@@yoyod.com
|
|
Phone: +98 43434433
|
|
|
|
Name: Robert Harris
|
|
Email: robert.harris@@yoyod.com
|
|
Note: Sales Department.
|
|
@end example
|
|
|
|
@noindent
|
|
As mentioned above, the output follows the ordering on the command
|
|
line, so @command{recsel work-contacts.rec
|
|
contacts.rec} would output the records of @file{work-contacts.rec} first
|
|
and then the ones from @file{contacts.rec}.
|
|
|
|
@noindent
|
|
Note however that @command{recsel} will merge records from several
|
|
files specified in the command line only if they are anonymous. If
|
|
the contacts in our files were typed:
|
|
|
|
@multitable @columnfractions .5 .5
|
|
@item
|
|
@example
|
|
@strong{contacts.rec:}
|
|
|
|
%rec: Contact
|
|
|
|
Name: Granny
|
|
Phone: +12 23456677
|
|
|
|
Name: Doctor
|
|
Phone: +12 58999222
|
|
@end example
|
|
@tab
|
|
@example
|
|
@strong{work-contacts.rec:}
|
|
|
|
%rec: Contact
|
|
|
|
Name: Yoyodyne Corp.
|
|
Email: sales@@yoyod.com
|
|
Phone: +98 43434433
|
|
|
|
Name: Robert Harris
|
|
Email: robert.harris@@yoyod.com
|
|
Note: Sales Department.
|
|
@end example
|
|
@end multitable
|
|
|
|
@noindent
|
|
Then we would get the following error message:
|
|
|
|
@example
|
|
$ recsel contacts.rec work-contacts.rec
|
|
recsel: error: duplicated record set 'Contact' from work-contacts.rec.
|
|
@end example
|
|
|
|
|
|
@node Selecting by Type
|
|
@section Selecting by Type
|
|
|
|
As we saw in the section discussing record descriptors, it is possible
|
|
to have several different types of records in a single recfile.
|
|
Consider for example a @file{gnu.rec} file containing information
|
|
about maintainers and packages in the GNU Project:
|
|
|
|
@example
|
|
%rec: Maintainer
|
|
|
|
Name: Jose E. Marchesi
|
|
Email: jemarch@@gnu.org
|
|
|
|
Name: Luca Saiu
|
|
Email: positron@@gnu.org
|
|
|
|
%rec: Package
|
|
|
|
Name: GNU recutils
|
|
LastRelease: 12 February 2014
|
|
|
|
Name: GNU epsilon
|
|
LastRelease: 10 March 2013
|
|
@end example
|
|
|
|
@noindent If @command{recsel} is invoked on that file it will complain:
|
|
|
|
@example
|
|
$ recsel gnu.rec
|
|
recsel: error: several record types found. Please use -t to specify one.
|
|
@end example
|
|
|
|
@noindent
|
|
This is because @command{recsel} does not know which records to
|
|
output: the maintainers or the packages. This can be resolved by
|
|
using the @code{-t} command line option:
|
|
|
|
@example
|
|
$ recsel -t Package gnu.rec
|
|
Name: GNU recutils
|
|
LastRelease: 12 February 2014
|
|
|
|
Name: GNU epsilon
|
|
LastRelease: 10 March 2013
|
|
@end example
|
|
|
|
@noindent
|
|
By default @command{recsel} never outputs record descriptors. This is
|
|
because most of the time the user is only interested in the data.
|
|
However, with the @code{-d} command line option, the record descriptor
|
|
of the selected type is printed preceding the data records:
|
|
|
|
@example
|
|
$ recsel -d -t Maintainer gnu.rec
|
|
%rec: Maintainer
|
|
|
|
Name: Jose E. Marchesi
|
|
Email: jemarch@@gnu.org
|
|
|
|
Name: Luca Saiu
|
|
Email: positron@@gnu.org
|
|
@end example
|
|
|
|
@noindent
|
|
Note that at the moment it is not possible to select non-typed
|
|
(default) records when other record sets are stored in the same file.
|
|
This is one of the reasons why mixing non-typed records and typed
|
|
records in a single recfile is not recommended.
|
|
|
|
@noindent
|
|
Note also that if a nonexistent record type is specified in @code{-t}
|
|
then @command{recsel} does nothing.
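
The rules for choosing a record set can be summarised in a short Python sketch. This is a simplified model under stated assumptions, not the logic of @command{recsel} itself: @code{select_by_type} is a hypothetical function, and a recfile is represented as a list of (type name, records) pairs.

```python
def select_by_type(record_sets, wanted=None):
    """record_sets: list of (type_name, records) pairs, in file order."""
    if wanted is None:
        if len(record_sets) > 1:
            # Mirrors the "several record types found" complaint.
            raise ValueError("several record types found; a type must be given")
        return record_sets[0][1] if record_sets else []
    for type_name, records in record_sets:
        if type_name == wanted:
            return records
    return []          # unknown type: nothing is selected


gnu = [
    ("Maintainer", [[("Name", "Jose E. Marchesi")], [("Name", "Luca Saiu")]]),
    ("Package", [[("Name", "GNU recutils")], [("Name", "GNU epsilon")]]),
]
print(select_by_type(gnu, "Package"))
print(select_by_type(gnu, "Nonexistent"))
```

The second call prints an empty list, matching the case where a nonexistent type is passed to @code{-t}.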
|
|
|
|
@node Selecting by Position
|
|
@section Selecting by Position
|
|
|
|
As was explained in the previous sections, @command{recsel} outputs
|
|
all the records of some record set. The records are echoed in the
|
|
same order they are written in the recfile. However, often it is
|
|
desirable to select a subset of the records, determined by the position
|
|
they occupy in their record set.
|
|
|
|
The @code{-n} command line option to @command{recsel} supports doing
|
|
this in a natural way. This is how we would retrieve the first
|
|
contact listed in a contacts database using @command{recsel}:
|
|
|
|
@example
|
|
$ recsel -n 0 contacts.rec
|
|
Name: Granny
|
|
Phone: +12 23456677
|
|
@end example
|
|
|
|
@noindent
|
|
Note that the index is zero-based. If we want to retrieve more
|
|
records we can specify several indexes to @code{-n} separated by
|
|
commas. If a given index is too big, it is simply ignored:
|
|
|
|
@example
|
|
$ recsel -n 0,1,999 contacts.rec
|
|
Name: Granny
|
|
Phone: +12 23456677
|
|
|
|
Name: Doctor
|
|
Phone: +12 58999222
|
|
@end example
|
|
|
|
@noindent With @code{-n}, the order in which the records are echoed does not
|
|
depend on the order of the indexes passed to @code{-n}.
|
|
For example, the output of @command{recsel -n 0,1} will be
|
|
identical to the output of @command{recsel -n 1,0}.
|
|
|
|
Ranges of indexes can also be used to select a subset of the records.
|
|
For example, the following call would select the first three
|
|
contacts of the database:
|
|
|
|
@example
|
|
$ recsel -n 0-2 contacts.rec
|
|
Name: Granny
|
|
Phone: +12 23456677
|
|
|
|
Name: Doctor
|
|
Phone: +12 58999222
|
|
|
|
Name: Dad
|
|
Phone: +12 88229900
|
|
@end example
|
|
|
|
@noindent It is possible to mix single indexes and index
|
|
ranges in the same call. For example, @command{recsel -n 0,5-6} would
|
|
select the first, sixth and seventh records.
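
The behaviour of index lists can be modelled in a few lines of Python. The function @code{select_by_position} is a hypothetical helper written for this illustration and is not part of recutils; it assumes a well-formed list of non-negative indexes and ranges.

```python
def select_by_position(records, spec):
    """Return the records named by an index list such as "0,5-6"."""
    wanted = set()
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-")
            wanted.update(range(int(low), int(high) + 1))
        else:
            wanted.add(int(part))
    # Walk the record set in file order; indexes past the end never match.
    return [rec for i, rec in enumerate(records) if i in wanted]


contacts = ["Granny", "Doctor", "Dad"]
print(select_by_position(contacts, "0,1,999"))   # ['Granny', 'Doctor']
print(select_by_position(contacts, "1,0"))       # ['Granny', 'Doctor']
print(select_by_position(contacts, "0-2"))       # ['Granny', 'Doctor', 'Dad']
```

Collecting the indexes into a set and then walking the records in file order is what makes the output independent of the order in which the indexes were written.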
|
|
|
|
@node Random Records
|
|
@section Random Records
|
|
|
|
Consider a database in which each record is a cooking recipe. It is
|
|
always difficult to decide what to cook each day, so it would be nice
|
|
if we could ask @command{recsel} to pick a random recipe. This can
|
|
be achieved using the @code{-m} (@code{--random}) command line option
|
|
of @command{recsel}:
|
|
|
|
@example
|
|
$ recsel -m 1 recipes.rec
|
|
Title: Curry chicken
|
|
Ingredient: A whole chicken
|
|
Ingredient: Curry
|
|
Preparation: ...
|
|
@end example
|
|
|
|
@noindent If we need two recipes, because we will be cooking at
|
|
both lunch and dinner, we can pass a different number to @code{-m}:
|
|
|
|
@example
|
|
$ recsel -m 2 recipes.rec
|
|
Title: Fabada Asturiana
|
|
Ingredient: 300 gr of fabes.
|
|
Ingredient: Chorizo
|
|
Ingredient: Morcilla
|
|
Preparation: ...
|
|
|
|
Title: Pasta with ragu
|
|
Ingredient: 500 gr of spaghetti.
|
|
Ingredient: 2 tomatoes.
|
|
Ingredient: Minced meat.
|
|
Preparation: ...
|
|
@end example
|
|
|
|
@noindent
|
|
The algorithm used to implement @code{-m} guarantees that
|
|
you will never get multiple instances of the same record. This means
|
|
that if a record set has @var{n} records and you ask for @var{n}
|
|
random records, you will get all the records in a random order.
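
The ``no repeated records'' guarantee corresponds to sampling without replacement. The Python sketch below illustrates that idea only; it says nothing about the algorithm actually used by @command{recsel}. The name @code{pick_random} is hypothetical, and capping the count at the size of the record set is an assumption made for the example.

```python
import random


def pick_random(records, count):
    """Pick distinct records; the cap on count is an assumption of this sketch."""
    return random.sample(records, min(count, len(records)))


recipes = ["Curry chicken", "Fabada Asturiana", "Pasta with ragu"]
print(pick_random(recipes, 2))
print(pick_random(recipes, len(recipes)))   # every recipe, in a random order
```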
|
|
|
|
@node Selection Expressions
|
|
@section Selection Expressions
|
|
|
|
@cindex selection expressions
|
|
@dfn{Selection expressions}, also known as ``sexes'' in recutils
|
|
jargon, are infix expressions that can be applied to a record.
|
|
A ``sex'' is a predicate which selects a subset of records within a recfile.
|
|
They can be simple expressions involving just one operator and a pair of
|
|
operands, or complex compound expressions with parenthetical sub-expressions
|
|
and many operators and operands.
|
|
One of their most common uses is to examine records matching a particular
|
|
set of conditions.
|
|
|
|
@menu
|
|
* Selecting by predicate:: Selecting records which satisfy conditions.
|
|
* SEX Operands:: Literal values, fields and sub-expressions.
|
|
* SEX Operators:: Arithmetic, logical and other operators.
|
|
* SEX Evaluation:: Selection expressions are like generators.
|
|
@end menu
|
|
|
|
@node Selecting by predicate
|
|
@subsection Selecting by predicate
|
|
@cindex selecting records
|
|
@cindex looking up data
|
|
@cindex retrieving data
|
|
Consider the example recfile @file{acquaintances.rec} introduced earlier.
|
|
It contains names of people along with their respective ages.
|
|
Suppose we want to get a list of the names of all the children.
|
|
It would not be easy to do this using @command{grep}.
|
|
Neither would it, for any reasonably large recfile, be feasible to search
|
|
manually for the children.
|
|
Fortunately the @command{recsel} command provides an easy way to do
|
|
such a lookup:
|
|
@cindex @command{recsel}
|
|
@example
|
|
$ recsel -e "Age < 18" -P Name acquaintances.rec
|
|
Bart Simpson
|
|
Adrian Mole
|
|
@end example
|
|
|
|
@noindent Let us look at each of the arguments to @command{recsel} in turn.
|
|
Firstly we have @code{-e} which tells @command{recsel} to look up records
|
|
matching the expression @code{Age < 18} --- in other words all those people
|
|
whose ages are less than 18.
|
|
@cindex selection expressions
|
|
This is an example of a @dfn{selection expression}.
|
|
In this case it is a simple test, but it can be as complex as needed.
|
|
|
|
Next, there is @code{-P} which tells @command{recsel} to print out the value of
|
|
the @code{Name} field --- because we want just the name, not the entire record.
|
|
The final argument is the name of the file from which the records are
|
|
to come: @file{acquaintances.rec}.
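
For readers who think in terms of a general purpose language, the lookup above behaves roughly like the following Python sketch. The helpers @code{values} and @code{select} are hypothetical and exist only for this illustration; records are written by hand as lists of (name, value) pairs.

```python
acquaintances = [
    [("Name", "Ada Lovelace"), ("Age", "36")],
    [("Name", "Peter the Great"), ("Age", "53")],
    [("Name", "Bart Simpson"), ("Age", "10")],
    [("Name", "Adrian Mole"), ("Age", "13.75")],
]


def values(record, field):
    """All values of the fields with the given name in a record."""
    return [value for name, value in record if name == field]


def select(records, predicate):
    """Keep the records for which the predicate holds."""
    return [record for record in records if predicate(record)]


# Counterpart of:  -e "Age < 18"
children = select(
    acquaintances,
    lambda rec: any(float(age) < 18 for age in values(rec, "Age")),
)

# Counterpart of:  -P Name  (values only, without the field name)
for record in children:
    for name in values(record, "Name"):
        print(name)
```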
|
|
|
|
Rather than explicitly storing ages in the recfile, a more realistic example
|
|
might have the date of birth instead
|
|
(otherwise it would be necessary to update the people's ages in the
|
|
recfile on every birthday).
|
|
|
|
@example
|
|
# Date of Birth
|
|
%type: Dob date
|
|
|
|
Name: Alfred Nebel
|
|
Dob: 20 April 2010
|
|
Email: alf@@example.com
|
|
|
|
Name: Bertram Worcester
|
|
Dob: 3 January 1966
|
|
Email: bert@@example.com
|
|
|
|
Name: Charles Spencer
|
|
Dob: 4 July 1997
|
|
Email: charlie@@example.com
|
|
|
|
Name: Dirk Hogart
|
|
Dob: 29 June 1945
|
|
Email: dirk@@example.com
|
|
|
|
Name: Ernest Wright
|
|
Dob: 26 April 1978
|
|
Email: ernie@@example.com
|
|
@end example
|
|
|
|
@noindent Now we can achieve a similar result as before, by looking up
|
|
the names of all those people who were born after a particular date:
|
|
@example
|
|
$ recfix acquaintances.rec
|
|
$ recsel -e "Dob >> '31 July 1994'" -p Name acquaintances.rec
|
|
Name: Alfred Nebel
|
|
Name: Charles Spencer
|
|
@end example
|
|
|
|
@cindex date comparison
|
|
@noindent The @code{>>} operator means ``later than'', and is used
|
|
here to select a date of birth after 31st July 1994.
|
|
Note also that this example uses a lower case @code{-p} whereas the preceding example
|
|
used the upper case @code{-P}. The difference is that @code{-p} prints the field name
|
|
and field value, whereas @code{-P} prints just the value.
|
|
|
|
@command{recsel} accepts more than one @code{-e} argument,
|
|
each introducing a selection expression,
|
|
in which case the records which satisfy all expressions are selected.
|
|
You can provide more than one field label to @code{-P} or @code{-p} in order to select
|
|
additional fields to be displayed.
|
|
For example, if you wanted to send an email to all children 14 to 18
|
|
years of age,
|
|
and today's date were @w{1st August} 2012, then you could use the following command to get
|
|
the name and email address of all such children:
|
|
|
|
@example
|
|
$ recfix acquaintances.rec
|
|
$ recsel -e "Dob >> '31 July 1994' && Dob << '01 August 1998'" \
|
|
-p Name,Email acquaintances.rec
|
|
Name: Charles Spencer
|
|
Email: charlie@@example.com
|
|
@end example
|
|
|
|
@noindent As you can see, there is only one such child in our record set.
|
|
|
|
@cindex quotation marks
|
|
Note that the example command shown above contains both double quotes @code{"} and
|
|
single quotes @code{'}.
|
|
@cindex date comparison
|
|
The double quotes are interpreted by the shell (@eg{} @command{bash}) and
|
|
the single quotes are interpreted by @command{recsel}, defining a
|
|
string. (The backslash is the shell's usual line continuation character,
used here only to keep the command within the width of the page.)
|
|
|
|
|
|
@node SEX Operands
|
|
@subsection SEX Operands
|
|
|
|
@cindex operands, SEX operands
|
|
The supported operands are: numbers, strings, field names and
|
|
parenthesized expressions.
|
|
|
|
@subsubsection Numeric Literals
|
|
|
|
@cindex literals, numeric literals
|
|
The supported numeric literals are integer numbers and real numbers.
|
|
The usual sign character @samp{-} is used to denote negative values.
|
|
Integer values can be denoted in base 10, base 16 using the @code{0x}
|
|
prefix, and base 8 using the @code{0} prefix. Examples are:
|
|
|
|
@example
|
|
10000
|
|
0
|
|
0xFF
|
|
-0xa
|
|
012
|
|
-07
|
|
-1342
|
|
.12
|
|
-3.14
|
|
@end example
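
The prefix rules can be made concrete with a short Python sketch. The function @code{parse_numeric_literal} is hypothetical and only mirrors the description above; it is not the parser used by recutils, and it assumes the literal is well formed.

```python
def parse_numeric_literal(text):
    """Return an int or a float for a literal such as 0xFF, 012 or -3.14."""
    sign = -1 if text.startswith("-") else 1
    digits = text[1:] if text.startswith("-") else text
    if digits.lower().startswith("0x"):
        return sign * int(digits, 16)      # base 16
    if "." in digits:
        return sign * float(digits)        # real number
    if len(digits) > 1 and digits.startswith("0"):
        return sign * int(digits, 8)       # base 8
    return sign * int(digits)              # base 10


for literal in ["10000", "0", "0xFF", "-0xa", "012", "-07", "-1342", ".12", "-3.14"]:
    print(literal, "->", parse_numeric_literal(literal))
```

Note that @code{012} denotes the value ten, not twelve, because of the leading zero.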
|
|
|
|
@subsubsection String Literals
|
|
@cindex literals, string literals
|
|
String values are delimited by either the @code{'} character or the
|
|
@code{"} character. Whichever delimiter is used, the delimiter closing
|
|
the literal must be the same as the delimiter used to open it.
|
|
|
|
Newlines and tabs can be part of a string literal.
|
|
|
|
Examples are:
|
|
|
|
@example
|
|
'Hello.'
|
|
'The following example is the empty string.'
|
|
''
|
|
@end example
|
|
|
|
@cindex quotation marks
|
|
The @code{'} and @code{"} characters can be part of a string if they
|
|
are escaped with a backslash, as in:
|
|
|
|
@example
|
|
'This string contains an apostrophe: \'.'
|
|
"This one a double quote: \"."
|
|
@end example
|
|
|
|
@subsubsection Field Values
|
|
@cindex field values, in selection expressions
|
|
The value of a field can be included in a selection expression
|
|
by writing its name. The field name is replaced by a string
|
|
containing the field value, to handle the possibility of records with
|
|
more than one field by that name. Examples:
|
|
|
|
@example
|
|
Name
|
|
Email
|
|
long_field_name
|
|
@end example
|
|
|
|
It is possible to use the role part of a field if it is not empty.
|
|
So, for example, if we are searching for the issues opened by
|
|
@samp{John Smith} in a database of issues we could write:
|
|
|
|
@example
|
|
$ recsel -e "OpenedBy = 'John Smith'"
|
|
@end example
|
|
|
|
@noindent
|
|
instead of using a full field name:
|
|
|
|
@example
|
|
$ recsel -e "Hacker:Name:OpenedBy = 'John Smith'"
|
|
@end example
|
|
|
|
When the name of a field appears in an expression, the expression is
|
|
applied to all the fields in the record featuring that name. So, for
|
|
example, the expression:
|
|
|
|
@example
|
|
Email ~ "\\.org"
|
|
@end example
|
|
|
|
@noindent
|
|
matches any record in which there is a field named @samp{Email}
|
|
whose value contains (the literal string) @samp{.org}.
|
|
If we are interested in the value of some specific email, we can specify
|
|
its relative position in the containing record by using @dfn{subscripts}.
|
|
@cindex subscripts, in selection expressions
|
|
Consider, for example:
|
|
|
|
@example
|
|
Email[0] ~ "\\.org"
|
|
@end example
|
|
|
|
@noindent
|
|
Will match for:
|
|
|
|
@example
|
|
Name: Mr. Foo
|
|
Email: foo@@foo.org
|
|
Email: mr.foo@@foo.com
|
|
@end example
|
|
|
|
@noindent
|
|
But not for:
|
|
|
|
@example
|
|
Name: Mr. Foo
|
|
Email: mr.foo@@foo.com
|
|
Email: foo@@foo.org
|
|
@end example
|
|
|
|
The regexp syntax supported in selection expressions is POSIX
|
|
EREs, with several GNU extensions. @xref{Regular Expressions}.
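
The difference between a plain field name and a subscripted one can be sketched in Python as follows. This is an illustration under assumptions: @code{matches_any} and @code{matches_at} are hypothetical helpers, and Python's @code{re} module stands in for the POSIX regular expressions used by recutils, which is good enough for the simple pattern shown.

```python
import re


def field_values(record, name):
    return [value for field, value in record if field == name]


def matches_any(record, name, regexp):
    """Model of `Email ~ regexp`: true if any field with that name matches."""
    return any(re.search(regexp, v) for v in field_values(record, name))


def matches_at(record, name, index, regexp):
    """Model of `Email[index] ~ regexp`: only that occurrence is tested."""
    found = field_values(record, name)
    return index < len(found) and re.search(regexp, found[index]) is not None


first = [("Name", "Mr. Foo"), ("Email", "foo@foo.org"), ("Email", "mr.foo@foo.com")]
second = [("Name", "Mr. Foo"), ("Email", "mr.foo@foo.com"), ("Email", "foo@foo.org")]

print(matches_at(first, "Email", 0, r"\.org"))
print(matches_at(second, "Email", 0, r"\.org"))
print(matches_any(second, "Email", r"\.org"))
```

Both records satisfy the unsubscripted test, but only the first one satisfies the test on the first @code{Email} field.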
|
|
|
|
@subsubsection Parenthesized Expressions
|
|
@cindex parentheses, in selection expressions
|
|
Parenthesis characters (@code{(} and @code{)}) can be used to group
|
|
sub-expressions in the usual way.
|
|
|
|
@node SEX Operators
|
|
@subsection Operators
|
|
|
|
@cindex operators, in selection expressions
|
|
The supported operators are arithmetic operators (addition,
|
|
subtraction, multiplication, division and modulus), logical operators,
|
|
string operators and field operators.
|
|
|
|
@subsubsection Arithmetic Operators
|
|
@cindex arithmetic operators
|
|
@cindex operators, arithmetic operators
|
|
|
|
Arithmetic operators for addition (@code{+}), subtraction (@code{-}),
|
|
multiplication (@code{*}), integer division (@code{/}) and modulus
|
|
(@code{%}) are supported with their usual meanings.
|
|
|
|
These operators require either numeric operands or string operands
|
|
whose value can be interpreted as numbers (integer or real).
|
|
|
|
@subsubsection Boolean Operators
|
|
@cindex boolean operators
|
|
@cindex operators, boolean operators
|
|
|
|
The boolean operators @strong{and} (@code{&&}), @strong{or}
|
|
(@code{||}) and @strong{not} (@code{!})@: are supported with the same
|
|
semantics as their C counterparts.
|
|
|
|
A compound boolean operator @code{=>} is also supported in order to
|
|
ease the elaboration of constraints in records: @code{A => B}, which
|
|
can be read as ``A implies B'', translates into @code{!A || (A && B)}.
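
A quick truth table, written here as a Python sketch, shows what the expansion amounts to. The function @code{implies} is a hypothetical name used only for this illustration, and Python booleans stand in for the integer values used by selection expressions.

```python
def implies(a, b):
    # Literal transcription of  !A || (A && B)
    return (not a) or (a and b)


for a in (False, True):
    for b in (False, True):
        # The expansion agrees with the shorter form  !A || B  in every case.
        assert implies(a, b) == ((not a) or b)
        print(a, b, implies(a, b))
```

The only combination that yields false is a true @code{A} with a false @code{B}, which is why the operator is convenient for stating constraints of the form ``if this field is set, then that condition must hold''.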
|
|
|
|
The boolean operators expect integer operands, and will try to convert
|
|
any string operand to an integer value.
|
|
|
|
@subsubsection Comparison Operators
|
|
|
|
@cindex operators, comparison operators
|
|
@cindex comparison
|
|
The comparison operators @strong{less than} (@code{<}), @strong{greater
|
|
than} (@code{>}), @strong{less than or equal} (@code{<=}),
|
|
@strong{greater than or equal} (@code{>=}), @strong{equal} (@code{=})
|
|
and @strong{unequal} (@code{!=}) are supported with their usual
|
|
meaning.
|
|
|
|
Strings can be compared with the equality operator (@code{=}).
|
|
|
|
The match operator (@code{~}) can be used to match a string with a
|
|
given regular expression (@pxref{Regular Expressions}).
|
|
|
|
@subsubsection Date Comparison Operators
|
|
@cindex date comparison
|
|
The comparison operators @strong{before} (@code{<<}), @strong{after}
|
|
(@code{>>}) and @strong{same time} (@code{==}) can be used with fields
|
|
and strings containing parseable dates.
|
|
|
|
@xref{Date input formats}.
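
The three date operators behave like ordinary comparisons once both operands have been parsed as dates. The Python sketch below illustrates this with the birth dates used earlier in this chapter. It is a simplified model: the helper names are hypothetical, and only the ``day month year'' form with English month names is handled, whereas recutils accepts many more date formats.

```python
from datetime import datetime


def parse_date(text):
    # Only the "day month year" form used in this chapter is handled.
    return datetime.strptime(text, "%d %B %Y")


def before(a, b):      # a << b
    return parse_date(a) < parse_date(b)


def after(a, b):       # a >> b
    return parse_date(a) > parse_date(b)


def same_time(a, b):   # a == b
    return parse_date(a) == parse_date(b)


births = [
    ("Alfred Nebel", "20 April 2010"),
    ("Bertram Worcester", "3 January 1966"),
    ("Charles Spencer", "4 July 1997"),
    ("Dirk Hogart", "29 June 1945"),
    ("Ernest Wright", "26 April 1978"),
]

# Counterpart of: Dob >> '31 July 1994' && Dob << '01 August 1998'
teens = [name for name, dob in births
         if after(dob, "31 July 1994") and before(dob, "01 August 1998")]
print(teens)
```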
|
|
|
|
@subsubsection Field Operators
|
|
@cindex field operators
|
|
@cindex counting occurrences of a field
|
|
Field counters are replaced by the number of occurrences of a field
|
|
with the given name in the record. For example:
|
|
|
|
@example
|
|
#Email
|
|
@end example
|
|
|
|
The previous expression is replaced with the number of fields named
|
|
@code{Email} in the record. It can be zero if the record does not
|
|
have a field with that name.
|
|
|
|
@subsubsection String Operators
|
|
@cindex string operators
|
|
@cindex operators, string operators
|
|
The string concatenation operator (@code{&}) can be used to
|
|
concatenate any number of strings and field values.
|
|
|
|
@example
|
|
'foo' & Name & 'bar'
|
|
@end example
|
|
|
|
@subsubsection Conditional Operator
|
|
@cindex conditional operator
|
|
@cindex operators, conditional operator
|
|
The ternary conditional operator can be used to select alternatives
|
|
based on the value of some expression:
|
|
|
|
@example
|
|
expr1 ? expr2 : expr3
|
|
@end example
|
|
|
|
If @code{expr1} evaluates to true (@ie{} it is an integer or the string
|
|
representation of an integer and its value is not zero) then the
|
|
operator yields @code{expr2}. Otherwise it yields @code{expr3}.
|
|
|
|
@node SEX Evaluation
|
|
@subsection Evaluation of Selection Expressions
|
|
@cindex evaluation, of selection expressions
|
|
|
|
Given that:
|
|
|
|
@itemize @minus
|
|
@item It is possible to refer to fields by name in selection expressions.
|
|
@item Records can have several fields with the same name.
|
|
@end itemize
|
|
|
|
@noindent
|
|
It is clear that some backtracking mechanism is needed in the
|
|
evaluation of the selection expressions. For example, consider the
|
|
following expression that is deciding whether a ``registration'' in a
|
|
webpage should be rejected:
|
|
|
|
@example
|
|
((Email ~ "foomail\.com") || (Age <= 18)) && !#Fixed
|
|
@end example
|
|
|
|
The previous expression will be evaluated for every possible
|
|
combination of the fields ``Email'', ``Age'' and ``Fixed'' present in
|
|
the record, until one of the combinations succeeds. At that point the
|
|
computation is interrupted.
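
The search over field combinations can be pictured with the following Python sketch of the registration check. It is a model of the idea rather than the evaluator used by recutils: @code{should_reject} is a hypothetical function, Python's @code{re} module stands in for the regular expression engine, and treating a missing field as a single empty choice is an assumption of this example.

```python
import re
from itertools import product


def field_values(record, name):
    return [value for field, value in record if field == name]


def should_reject(record):
    """Model of the registration check discussed above."""
    fixed_count = len(field_values(record, "Fixed"))     # value of #Fixed
    # A missing field is modelled as one empty choice (an assumption).
    emails = field_values(record, "Email") or [None]
    ages = field_values(record, "Age") or [None]
    for email, age in product(emails, ages):
        bad_mail = email is not None and re.search(r"foomail\.com", email) is not None
        too_young = age is not None and float(age) <= 18
        if (bad_mail or too_young) and fixed_count == 0:
            return True          # first successful combination stops the search
    return False


registration = [("Email", "a@ok.org"), ("Email", "b@foomail.com"), ("Age", "30")]
print(should_reject(registration))
print(should_reject(registration + [("Fixed", "yes")]))
```

The first @code{Email} value does not satisfy the expression, so the search moves on to the second one, which does.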
|
|
|
|
When used to decide whether a record matches some criteria, the goal
|
|
of a selection expression is to act as a boolean expression. In that
|
|
case the final value of the expression depends on both the type and
|
|
the value of the result produced by the top-most subexpression:
|
|
|
|
@itemize @minus
|
|
@item If the result is an @b{integer}, the expression is true if its
|
|
value is not zero.
|
|
@item If the result is a @b{real}, or a @b{string}, the expression
|
|
evaluates to false.
|
|
@end itemize
|
|
|
|
Sometimes a selection expression is used to compute a result instead
|
|
of a boolean. In that case the returned value is converted to a
|
|
string. This is used when replacing the slots in templates
|
|
(@pxref{Templates}).
|
|
|
|
@node Field Expressions
|
|
@section Field Expressions
|
|
|
|
@cindex field expressions
|
|
@cindex FEX
|
|
|
|
@dfn{Field expressions} (also known as ``fexes'') are a way to select
|
|
fields of a record. They also allow you to do certain transformations
|
|
on the selected fields, such as changing their names.
|
|
|
|
A FEX comprises a sequence of @dfn{elements} separated by commas:
|
|
|
|
@example
|
|
ELEM_1,ELEM_2,@dots{},ELEM_N
|
|
@end example
|
|
|
|
Each element makes a reference to one or more fields in a record
|
|
identified by a given name and an optional subscript:
|
|
|
|
@example
|
|
@var{Field_Name}[@var{min}-@var{max}]
|
|
@end example
|
|
|
|
@noindent
|
|
@var{min} and @var{max} are zero-based indexes. It is possible to
|
|
refer to a field occupying a given position. For example, consider
|
|
the following record:
|
|
|
|
@example
|
|
Name: Mr. Foo
|
|
Email: foo@@foo.com
|
|
Email: foo@@foo.org
|
|
Email: mr.foo@@foo.org
|
|
@end example
|
|
|
|
@noindent
|
|
We would select all the emails of the record with:
|
|
|
|
@example
|
|
Email
|
|
@end example
|
|
|
|
@noindent
|
|
The first email with:
|
|
|
|
@example
|
|
Email[0]
|
|
@end example
|
|
|
|
@noindent
|
|
The third email with:
|
|
|
|
@example
|
|
Email[2]
|
|
@end example
|
|
|
|
@noindent
|
|
The second and the third email with:
|
|
|
|
@example
|
|
Email[1-2]
|
|
@end example
|
|
|
|
And so on. It is possible to select the same field (or
|
|
range of fields) more than once just by repeating them in a field
|
|
expression. Thus, the field expression:
|
|
|
|
@example
|
|
Email[0],Name,Email
|
|
@end example
|
|
|
|
@noindent
|
|
will print the first email, the name, and then all the email fields
|
|
including the first one.
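
The selection rules seen so far can be summarised in a small Python sketch. The function @code{apply_fex} is a hypothetical helper, not the recutils implementation; it understands only plain field names with an optional subscript or range, and it assumes field names made of letters, digits and underscores.

```python
import re


def apply_fex(record, fex):
    """Select fields from a record given as a list of (name, value) pairs."""
    selected = []
    for element in fex.split(","):
        match = re.fullmatch(r"(\w+)(?:\[(\d+)(?:-(\d+))?\])?", element)
        if match is None:
            raise ValueError("unsupported element: " + element)
        name, low, high = match.groups()
        fields = [(n, v) for n, v in record if n == name]
        if low is not None:
            first = int(low)
            last = int(high) if high is not None else first
            fields = fields[first:last + 1]      # zero-based, inclusive range
        selected.extend(fields)
    return selected


mr_foo = [
    ("Name", "Mr. Foo"),
    ("Email", "foo@foo.com"),
    ("Email", "foo@foo.org"),
    ("Email", "mr.foo@foo.org"),
]
for name, value in apply_fex(mr_foo, "Email[0],Name,Email"):
    print(name + ": " + value)
```

Each element is resolved independently against the whole record, which is why the first email appears twice in the output of the last expression.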
|
|
|
|
@cindex aliasing, field name aliasing
|
|
It is possible to include a @dfn{rewrite rule} in an element of a
|
|
field expression, which specifies an alias for the selected fields:
|
|
|
|
@example
|
|
@var{Field_Name}[@var{min}-@var{max}]:@var{Alias}
|
|
@end example
|
|
|
|
@noindent
|
|
For example, the following field expression specifies an alias for the
|
|
fields named @code{Email} in a record:
|
|
|
|
@example
|
|
Name,Email:ElectronicMail
|
|
@end example
|
|
|
|
Since the rewrite rules only affect the fields selected in a single
|
|
element of the field expression, it is possible to define different
|
|
aliases to several fields having the same name but occupying different
|
|
positions:
|
|
|
|
@example
|
|
Name,Email[0]:PrimaryEmail,Email[1]:SecondaryEmail
|
|
@end example
|
|
|
|
@noindent
|
|
When that field expression is applied to the following record:
|
|
|
|
@example
|
|
Name: Mr. Foo
|
|
Email: primary@@email.com
|
|
Email: secondary@@email.com
|
|
Email: other@@email.com
|
|
@end example
|
|
|
|
@noindent
|
|
the result will be:
|
|
|
|
@example
|
|
Name: Mr. Foo
|
|
PrimaryEmail: primary@@email.com
|
|
SecondaryEmail: secondary@@email.com
|
|
Email: other@@email.com
|
|
@end example
|
|
|
|
It is possible to use the dot notation in order to refer to fields and
|
|
sub-fields. This is mainly used in the context of joins, where new
|
|
fields are created having compound names such as @code{Foo_Bar}. A
|
|
reference to such a field can be done in the fex using dot notation
|
|
as follows:
|
|
|
|
@example
|
|
Foo.Bar
|
|
@end example
|
|
|
|
@node Sorted Output
|
|
@section Sorted Output
|
|
|
|
@cindex @code{%sort}
|
|
@cindex sorting
|
|
The @code{%sort} special field sets sorting criteria for the records
|
|
contained in a record set. Its usage is:
|
|
|
|
@example
|
|
%sort: @var{field1} @var{field2} ...
|
|
@end example
|
|
|
|
@noindent
|
|
This means that the desired order for the records is determined by
|
|
the contents of the fields named in the @code{%sort} value. The
|
|
sorting is always done in ascending order, and there may be records
|
|
that lack the involved fields, @ie{} the sorting
|
|
fields need not be mandatory.
|
|
|
|
It is an error to have more than one @code{%sort} field in the same
|
|
record descriptor, as only one field list can be used as sorting
|
|
criteria.
|
|
|
|
Consider for example that we want to keep the records in our inventory
|
|
system ordered by entry date. We could achieve that by using the
|
|
following record descriptor in the database:
|
|
|
|
@example
|
|
%rec: Item
|
|
%type: Date date
|
|
%sort: Date
|
|
|
|
Id: 1
|
|
Title: Staplers
|
|
Date: 10 February 2011
|
|
|
|
Id: 2
|
|
Title: Ruler Pack 20
|
|
Date: 2 March 2009
|
|
|
|
@dots{}
|
|
@end example
|
|
|
|
@noindent
|
|
As you can see in the example above, the fact we use @code{%sort} in a
|
|
database does not mean that the database will always be physically
|
|
ordered. Unsorted record sets are not a data integrity
|
|
problem, and thus the diagnosis tools must not declare a recfile as
|
|
invalid because of this. The utility @command{recfix} provides a way
|
|
to physically order the records in the file (@pxref{Invoking recfix}).
|
|
|
|
On the other hand any program listing, presenting or processing data
|
|
extracted from the recfile must honor the @code{%sort} entry. For
|
|
example, when using the following @command{recsel} program in the
|
|
database above we would get the output sorted by date:
|
|
|
|
@example
|
|
$ recsel inventory.rec
|
|
Id: 2
|
|
Title: Ruler Pack 20
|
|
Date: 2 March 2009
|
|
|
|
Id: 1
|
|
Title: Staplers
|
|
Date: 10 February 2011
|
|
@end example
|
|
|
|
@cindex order of fields
|
|
@noindent
|
|
The sorting of the selected field depends on its type:
|
|
|
|
@itemize @minus
|
|
@item Numeric fields (integers, ranges, reals) are numerically ordered.
|
|
@item Boolean fields are ordered considering that ``false'' values come first.
|
|
@item Dates are ordered chronologically.
|
|
@item Any other kind of field is ordered using a lexicographic order.
|
|
@end itemize
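
The type-dependent ordering can be expressed as a sort key, as in the Python sketch below. It is an approximation for illustration only: @code{sort_key} is a hypothetical function, the type names and the spellings accepted as ``true'' are assumptions of this example, and dates are parsed only in the ``day month year'' form.

```python
from datetime import datetime


def sort_key(value, kind):
    """Map a field value to something Python can order, by field type."""
    if kind in ("int", "range", "real"):
        return float(value)                          # numeric order
    if kind == "bool":
        return value in ("yes", "true", "1")         # False sorts before True
    if kind == "date":
        return datetime.strptime(value, "%d %B %Y")  # chronological order
    return value                                     # lexicographic order


items = [
    ("1", "Staplers", "10 February 2011"),
    ("2", "Ruler Pack 20", "2 March 2009"),
]
for item in sorted(items, key=lambda item: sort_key(item[2], "date")):
    print(item)
```

Sorting the same values as plain strings would give a different result, since @code{"10 February 2011"} precedes @code{"2 March 2009"} lexicographically.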
|
|
|
|
It is possible to specify several fields as the sorting criteria. In
|
|
that case ties on an earlier field are broken by the fields that follow it. Consider
|
|
for example the following unsorted database containing marks for
|
|
several students:
|
|
|
|
@example
|
|
%rec: Marks
|
|
%type: Class enum A B C
|
|
%type: Score real
|
|
|
|
Name: Mr. One
|
|
Class: C
|
|
Score: 6.8
|
|
|
|
Name: Mr. Two
|
|
Class: A
|
|
Score: 6.8
|
|
|
|
Name: Mr. Three
|
|
Class: B
|
|
Score: 9.2
|
|
|
|
Name: Mr. Four
|
|
Class: A
|
|
Score: 2.1
|
|
|
|
Name: Mr. Five
|
|
Class: C
|
|
Score: 4
|
|
@end example
|
|
|
|
@noindent
|
|
If we wanted to sort it by @code{Class} and by @code{Score} we would
|
|
insert a @code{%sort} special field in the descriptor, having:
|
|
|
|
@example
|
|
%rec: Marks
|
|
%type: Class enum A B C
|
|
%type: Score real
|
|
%sort: Class Score
|
|
|
|
Name: Mr. Four
|
|
Class: A
|
|
Score: 2.1
|
|
|
|
Name: Mr. Two
|
|
Class: A
|
|
Score: 6.8
|
|
|
|
Name: Mr. Three
|
|
Class: B
|
|
Score: 9.2
|
|
|
|
Name: Mr. Five
|
|
Class: C
|
|
Score: 4
|
|
|
|
Name: Mr. One
|
|
Class: C
|
|
Score: 6.8
|
|
@end example
|
|
|
|
@noindent
|
|
The order of the fields in the @code{%sort} field is
|
|
significant. If we reverse the order in the example above then we get
|
|
a different sorted set:
|
|
|
|
@example
|
|
%rec: Marks
|
|
%type: Class enum A B C
|
|
%type: Score real
|
|
%sort: Score Class
|
|
|
|
Name: Mr. Four
|
|
Class: A
|
|
Score: 2.1
|
|
|
|
Name: Mr. Five
|
|
Class: C
|
|
Score: 4
|
|
|
|
Name: Mr. Two
|
|
Class: A
|
|
Score: 6.8
|
|
|
|
Name: Mr. One
|
|
Class: C
|
|
Score: 6.8
|
|
|
|
Name: Mr. Three
|
|
Class: B
|
|
Score: 9.2
|
|
@end example
|
|
|
|
@noindent
|
|
In this last case, @code{Mr. One} comes after @code{Mr. Two} because the
|
|
class @code{A}
|
|
comes before the class @code{C} even though the score is the same (@code{6.8}).
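
Sorting on several fields corresponds to comparing tuples of field values, where ties on the first component are broken by the next one. The following Python sketch reproduces both orderings of the marks database; the tuples are written by hand for the illustration and the variable names are hypothetical.

```python
marks = [
    ("Mr. One", "C", 6.8),
    ("Mr. Two", "A", 6.8),
    ("Mr. Three", "B", 9.2),
    ("Mr. Four", "A", 2.1),
    ("Mr. Five", "C", 4.0),
]

# Counterpart of:  %sort: Class Score
by_class_then_score = sorted(marks, key=lambda m: (m[1], m[2]))

# Counterpart of:  %sort: Score Class
by_score_then_class = sorted(marks, key=lambda m: (m[2], m[1]))

print([m[0] for m in by_class_then_score])
print([m[0] for m in by_score_then_class])
```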
|
|
|
|
|
|
@node Editing Records
|
|
@chapter Editing Records
|
|
|
|
The simplest way of editing a recfile is to start your favourite
|
|
text editor and hack the contents of the file as desired. However,
|
|
the rec format is structured enough so recfiles can be updated
|
|
automatically by programs. This is useful for writing shell scripts
|
|
or when there are complex data integrity rules stored in the file that
|
|
we want to be sure to preserve.
|
|
|
|
The following sections discuss the usage of the recutils for altering
|
|
recfiles at the record level: adding new records, deleting or
|
|
commenting them out, sorting them, @etc{}
|
|
|
|
@menu
|
|
* Inserting Records:: Inserting data into recfiles.
|
|
* Deleting Records:: Removing data.
|
|
* Sorting Records:: Physical reordering of records.
|
|
@end menu
|
|
|
|
@node Inserting Records
|
|
@section Inserting Records
|
|
|
|
Adding new records to a recfile is pretty trivial: open it with your
|
|
text editor and just write down the fields comprising the records.
|
|
This is really the best way to add contents to a recfile containing
|
|
simple data. However, complex databases may introduce some
|
|
difficulties:
|
|
|
|
@table @emph
|
|
@item Multi-line values.
|
|
It can be tedious to manually encode the several lines.
|
|
@item Data integrity.
|
|
It is difficult to manually maintain the integrity of data stored
|
|
in the database.
|
|
@item Counters and timestamps.
|
|
Some record sets feature auto-generated fields, which are commonly
|
|
used to implement counters and time-stamps. @xref{Auto-Generated
|
|
Fields}.
|
|
@end table
|
|
|
|
Thus, to facilitate the insertion of new data a command line utility called
|
|
@command{recins} is included in the recutils. @command{recins} is
very simple to use, and can be run from the command line or called from
|
|
another program. The following subsections discuss several aspects of
|
|
using this utility.
|
|
|
|
@menu
|
|
* Adding Records With recins:: Basics of the @command{recins} utility.
|
|
* Replacing Records With recins:: Substituting records in a file.
|
|
* Adding Anonymous Records:: Inserting or replacing records with no
|
|
type.
|
|
@end menu
|
|
|
|
@node Adding Records With recins
|
|
@subsection Adding Records With recins
|
|
|
|
Each invocation of @command{recins} adds one record to the targeted
|
|
database. The fields comprising the records are specified using pairs
|
|
of @code{-f} and @code{-v} command line arguments. For example, this
|
|
is how we would add the first entry to a previously empty contacts
|
|
database:
|
|
|
|
@example
|
|
$ recins -f Name -v "Mr. Foo" -f Email -v foo@@bar.baz contacts.rec
|
|
$ cat contacts.rec
|
|
Name: Mr. Foo
|
|
Email: foo@@bar.baz
|
|
@end example
|
|
|
|
@noindent
|
|
If we invoke @command{recins} again on the same database we will be adding a
|
|
second record:
|
|
|
|
@example
|
|
$ recins -f Name -v "Mr. Bar" -f Email -v bar@@gnu.org contacts.rec
|
|
$ cat contacts.rec
|
|
Name: Mr. Foo
|
|
Email: foo@@bar.baz
|
|
|
|
Name: Mr. Bar
|
|
Email: bar@@gnu.org
|
|
@end example
|
|
|
|
There is no limit on the number of @code{-f} @code{-v} pairs that can
|
|
be specified to @command{recins}, other than any limit on command line arguments
|
|
which may be imposed by the shell.
|
|
|
|
The field values provided using @code{-v} are encoded to follow the
|
|
rec format conventions, including multi-line field values.
|
|
Consider the following example:
|
|
|
|
@example
|
|
$ recins -f Name -v "Mr. Foo" -f Address -v '
|
|
Foostrs. 19
|
|
Frankfurt am Oder
|
|
Germany' contacts.rec
|
|
$ cat contacts.rec
|
|
Name: Mr. Foo
|
|
Address:
|
|
+ Foostrs. 19
|
|
+ Frankfurt am Oder
|
|
+ Germany
|
|
@end example
|
|
|
|
It is also possible to provide fields already encoded as rec data for
|
|
their addition, using the @code{-r} command line argument. This
|
|
argument can be intermixed with @code{-f} @code{-v}.
|
|
|
|
@example
|
|
$ recins -f Name -v "Mr. Foo" -r "Email: foo@@bar.baz" contacts.rec
|
|
$ cat contacts.rec
|
|
Name: Mr. Foo
|
|
Email: foo@@bar.baz
|
|
@end example
|
|
|
|
If the string passed to @code{-r} is not valid rec data then
|
|
@command{recins} will complain with an error and the operation will be
|
|
aborted.
|
|
|
|
At this time, it is not possible to add new records
|
|
containing comments.
|
|
|
|
@node Replacing Records With recins
|
|
@subsection Replacing Records With recins
|
|
|
|
@command{recins} can also be used to replace existing records in a
|
|
database with a provided record. This is done by specifying some
|
|
criteria selecting the record (or records) to be replaced.
|
|
|
|
Consider for example the following command applied to our contacts
|
|
database:
|
|
|
|
@example
|
|
$ recins -e "Email = 'foo@@bar.baz'" -f Name -v "Mr. Foo" \
|
|
-f Email -v "new@@bar.baz" contacts.rec
|
|
@end example
|
|
|
|
@noindent
|
|
The contact featuring an email @code{foo@@bar.baz} gets replaced with
|
|
the following record:
|
|
|
|
@example
|
|
Name: Mr. Foo
|
|
Email: new@@bar.baz
|
|
@end example
|
|
|
|
The records to be replaced can also be specified by index, or a
|
|
range of indexes. For example, the following command replaces the
|
|
first, second and third records in a database with dummy records:
|
|
|
|
@example
|
|
$ recins -n 0,1-2 -f Dummy -v XXX foo.rec
|
|
$ cat foo.rec
|
|
Dummy: XXX
|
|
|
|
Dummy: XXX
|
|
|
|
Dummy: XXX
|
|
|
|
... Other records ...
|
|
@end example
|
|
|
|
@node Adding Anonymous Records
|
|
@subsection Adding Anonymous Records
|
|
|
|
In a previous chapter we noted that @command{recsel} interprets the
|
|
absence of a @command{-t} argument depending on the actual contents of
|
|
the file. If the recfile contains records of just one type the
|
|
command assumes that the user is referring to these records.
|
|
|
|
@command{recins} does not follow this convention, and the absence of
|
|
an explicit type always means to insert (or replace) an anonymous
|
|
record. Consider for example the following database:
|
|
|
|
@example
|
|
%rec: Marks
|
|
%type: Class enum A B C
|
|
|
|
Name: Alfred
|
|
Class: A
|
|
|
|
Name: Bertram
|
|
Class: B
|
|
@end example
|
|
|
|
@noindent
|
|
If we want to insert a new mark we have to specify the type explicitly
|
|
using @command{-t}:
|
|
|
|
@example
|
|
$ cat marks.rec | recins -t Marks -f Name -v Xavier -f Class -v C
|
|
%rec: Marks
|
|
%type: Class enum A B C
|
|
|
|
Name: Alfred
|
|
Class: A
|
|
|
|
Name: Bertram
|
|
Class: B
|
|
|
|
Name: Xavier
|
|
Class: C
|
|
@end example
|
|
|
|
@noindent
|
|
If we forget to specify the type then an anonymous record is created
|
|
instead:
|
|
|
|
@example
|
|
$ cat marks.rec | recins -f Name -v Xavier -f Class -v C
|
|
Name: Xavier
|
|
Class: C
|
|
|
|
%rec: Marks
|
|
%type: Class enum A B C
|
|
|
|
Name: Alfred
|
|
Class: A
|
|
|
|
Name: Bertram
|
|
Class: B
|
|
@end example
|
|
|
|
@node Deleting Records
|
|
@section Deleting Records
|
|
@cindex deleting records
|
|
|
|
Just as @code{recins} inserts records, the utility @code{recdel} deletes them.
|
|
Consider the following recfile @file{stock.rec}:
|
|
@example
|
|
%rec: Item
|
|
%type: Expiry date
|
|
%sort: Title
|
|
|
|
Title: First Aid Kit
|
|
Expiry: 2 May 2009
|
|
|
|
Title: Emergency Rations
|
|
Expiry: 10 August 2009
|
|
|
|
Title: Life raft
|
|
Expiry: 2 March 2009
|
|
@end example
|
|
|
|
Suppose we wanted to delete all items
|
|
with an @code{Expiry} value before a certain date.  We could do this with the following command:
|
|
|
|
@example
|
|
$ recdel -t Item -e 'Expiry << "5/12/2009"' stock.rec
|
|
@end example
|
|
@noindent
|
|
After running this command, only one record will remain in the file
|
|
(@viz{} the one titled `Emergency Rations') because all the others have expiry dates
|
|
prior to 12 May 2009.
|
|
@footnote{`5/12/2009' means the 12th day of May 2009, @emph{not} the fifth day of December,
|
|
even if your @env{LC_TIME} environment variable has been set to suggest otherwise.}
|
|
The @command{-t} option can be omitted if, and only if, there is no @code{%rec} field
|
|
in the recfile.
|
|
|
|
@command{recdel} tries to warn you if you attempt to perform a delete operation
|
|
which it deems to be too pervasive. In such cases, it will refuse to run,
|
|
unless you give the @command{--force} flag.
|
|
However, you should not rely upon @command{recdel} to protect you, because it cannot
|
|
always correctly guess that you might be deleting more records than intended.
|
|
For this reason, it may be wise to use the @command{-c} flag, which causes
|
|
the relevant records to be commented out, rather than deleted. (And
|
|
of course backups are always wise.)
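
One cautious workflow, sketched here under the assumption that the
@file{stock.rec} file shown above is in the current directory, is to
preview the selection with @command{recsel}, keep a copy of the file,
and only then comment the records out.  The backup file name
@file{stock.rec.bak} is just an illustration:

@example
$ recsel -t Item -e 'Expiry << "5/12/2009"' stock.rec
$ cp stock.rec stock.rec.bak
$ recdel -c -t Item -e 'Expiry << "5/12/2009"' stock.rec
@end example

@noindent
Because @command{recsel} and @command{recdel} accept the same selection
expression, the records printed by the first command are the ones that
the last command comments out.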
|
|
|
|
The complete options available to the @command{recdel} command are explained later.
|
|
@xref{Invoking recdel}.
|
|
|
|
@node Sorting Records
|
|
@section Sorting Records
|
|
@cindex sorting
|
|
@cindex sorting, physically
|
|
|
|
In the example above, note the existence of the @code{%sort: Title} line.
|
|
This field was discussed previously (@pxref{Sorted Output}) and, as mentioned, does not
|
|
imply that the records need to be stored in the recfile in any particular order.
|
|
|
|
However, if desired, you can automatically arrange the recfile in that order using
|
|
@command{recfix} with the @command{--sort} flag.
|
|
After running the command
|
|
@example
|
|
$ recfix --sort stock.rec
|
|
@end example
|
|
@noindent
|
|
the file @file{stock.rec} will have its records sorted in alphabetical order
|
|
of the @code{Title} fields, thus:
|
|
@example
|
|
%rec: Item
|
|
%type: Expiry date
|
|
%sort: Title
|
|
|
|
Title: Emergency Rations
|
|
Expiry: 10 August 2009
|
|
|
|
Title: First Aid Kit
|
|
Expiry: 2 May 2009
|
|
|
|
Title: Life raft
|
|
Expiry: 2 March 2009
|
|
@end example
|
|
|
|
|
|
|
|
@node Editing Fields
|
|
@chapter Editing Fields
|
|
|
|
Fields of a recfile can, of course, be edited manually using an editor and this is often
|
|
the easiest way when only a few fields need to be changed or when the nature of the changes does
|
|
not follow any particular pattern.
|
|
If, however, a large number of similar changes to several records are
|
|
required, the @command{recset} command can make the job easier.
|
|
|
|
The formal description of @command{recset} is presented later
|
|
(@pxref{Invoking recset}). In this chapter some typical usage
|
|
examples are discussed. As with @command{recdel}, @command{recset} if
|
|
used erroneously has the potential to make very pervasive changes,
|
|
which could result in a large loss of data. It is prudent therefore
|
|
to take a copy of a recfile before running such commands.
|
|
|
|
|
|
@menu
|
|
* Adding Fields:: Adding new fields to records.
|
|
* Setting Fields:: Editing field values.
|
|
* Deleting Fields:: Removing or commenting-out fields.
|
|
* Renaming Fields:: Changing the name of a field.
|
|
@end menu
|
|
|
|
|
|
@node Adding Fields
|
|
@section Adding Fields
|
|
@cindex adding fields
|
|
|
|
As mentioned above, the command @command{recins} adds new records to a
|
|
recfile, but it cannot
|
|
add fields to an existing record.
|
|
This task can be achieved automatically using @command{recset} with its @command{-a} flag.
|
|
|
|
Suppose that (after a stock inspection) you wanted to add an `Inspected' field to all the items in
|
|
the recfile.
|
|
The following command could be used.
|
|
@example
|
|
$ recset -t Item -f Inspected -a 'Yes' stock.rec
|
|
@end example
|
|
@noindent
|
|
Here, because no record selection flag was provided, the command affected @emph{all} the
|
|
records of type `Item'.
|
|
We could limit the effect of the command using the @command{-e}, @command{-q},
|
|
@command{-n} or @command{-m} flags.
|
|
For example to add the `Inspected' field to only the first item the following command
|
|
would work:
|
|
@example
|
|
$ recset -t Item -n 0 -f Inspected -a 'Yes' stock.rec
|
|
@end example
|
|
@noindent
|
|
Similarly, a selection expression could have been used with the @command{-e} flag in order to
|
|
add the field only to records which satisfy the expression.
|
|
|
|
If you use @command{recset} with the @command{-a} flag on a field that already exists, a
|
|
new field (in addition to those already present) will be appended with the given value.
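
As an illustration, and assuming that @file{stock.rec} starts out as
shown in the previous chapter, the following sketch adds an
@code{Inspected} field to every item and then appends a second one to a
single record.  The value @samp{Rechecked} is made up for this example:

@example
$ recset -t Item -f Inspected -a 'Yes' stock.rec
$ recset -t Item -e 'Title = "First Aid Kit"' \
         -f Inspected -a 'Rechecked' stock.rec
$ recsel -t Item -e 'Title = "First Aid Kit"' -p Inspected stock.rec
Inspected: Yes
Inspected: Rechecked
@end example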
|
|
|
|
|
|
@node Setting Fields
|
|
@section Setting Fields
|
|
@cindex mutating field values
|
|
|
|
It is also possible to update the value of a field.
|
|
This is done using @command{recset} with its @command{-s} flag.
|
|
In the previous example, an `Inspected' flag was added to certain records,
|
|
with the value `Yes'.
|
|
After reflection, one might want to record the date of inspection, rather than
|
|
a simple yes/no flag.
|
|
Records which have no such field will remain unchanged.
|
|
@example
|
|
$ recset -t Item -f Inspected -s '30 October 2006' stock.rec
|
|
@end example
|
|
Although the above command does not have any selection criteria, it will
|
|
only affect those records for which an `Inspected' field exists.
|
|
This is because the @command{-s} flag only sets values of existing fields.
|
|
It will not create any fields.
|
|
|
|
If instead the @command{-S} flag is used, this will create the field
|
|
(if it does not already exist) @emph{and} set its value.
|
|
@example
|
|
$ recset -t Item -f Inspected -S '30 October 2006' stock.rec
|
|
@end example
|
|
|
|
@node Deleting Fields
|
|
@section Deleting Fields
|
|
@cindex deleting fields
|
|
|
|
You can delete fields using @command{recset}'s @command{-d} flag.
|
|
For example, if we wanted to delete the @code{Inspected} field which we introduced above,
|
|
we could do so as follows:
|
|
@example
|
|
$ recset -t Item -f Inspected -d stock.rec
|
|
@end example
|
|
@noindent
|
|
This would delete @emph{all} fields named @code{Inspected} from @emph{all} records of type
|
|
@code{Item}.
|
|
It may be that we only wanted to delete the @code{Inspected} fields from records which satisfy
|
|
a certain condition.
|
|
The following would delete the fields only from items whose @code{Expiry} date was before
|
|
2 January 2010:
|
|
@example
|
|
$ recset -t Item -e 'Expiry << "2 January 2010"' -f Inspected -d stock.rec
|
|
@end example
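
The fields can also be commented out instead of removed.  As a sketch,
assuming the @command{-c} flag of @command{recset} (@pxref{Invoking
recset}), the following command turns every @code{Inspected} field of
the items into a comment, so that the old values stay in the file but
are no longer part of the records:

@example
$ recset -t Item -f Inspected -c stock.rec
@end example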
|
|
|
|
|
|
@node Renaming Fields
|
|
@section Renaming Fields
|
|
@cindex renaming fields
|
|
|
|
Another use of @command{recset} is to rename existing fields. This is achieved using the
|
|
@command{-r} flag.
|
|
To rename all instances of the @code{Expiry} field occurring in any
|
|
record of type @code{Item} to @code{UseBy},
|
|
the following command suffices:
|
|
@example
|
|
$ recset -t Item -f Expiry -r 'UseBy' stock.rec
|
|
@end example
|
|
@noindent
|
|
As with most operations, this could be done selectively, using the @command{-e} flag and a
|
|
selection expression.
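
For instance, the following sketch renames the field only in the record
for the life raft; the selection expression is just an illustration:

@example
$ recset -t Item -e 'Title = "Life raft"' -f Expiry -r 'UseBy' stock.rec
@end example

@noindent
The other items keep their @code{Expiry} fields.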
|
|
|
|
|
|
@node Field Types
|
|
@chapter Field Types
|
|
|
|
Field values are, by default, unrestricted text strings. However, it
|
|
is often useful to impose some restrictions on the values of certain
|
|
fields. For example, consider the following record:
|
|
|
|
@example
|
|
Id: 111
|
|
Name: Jose E. Marchesi
|
|
Age: 30
|
|
MaritalStatus: single
|
|
Phone: +49 666 666 66
|
|
@end example
|
|
|
|
The values of the fields must clearly follow some structure in order
|
|
to make sense. @code{Id} is a numeric identifier for a
|
|
person. @code{Name} will never use several lines. @code{Age} will
|
|
typically be in the range @code{0..120}, and there are only a few
|
|
valid values for @code{MaritalStatus}: single, married, divorced, and
|
|
widow(er).
|
|
Phone numbers may also be restricted to some standard format.
|
|
All these restrictions (and many others) can be enforced by using
|
|
@dfn{field types}.
|
|
|
|
There are two kinds of field types: @dfn{anonymous} and @dfn{named}.  Those are
|
|
described in the following subsections.
|
|
|
|
@menu
|
|
* Declaring Types:: Declaration of types in record descriptors.
|
|
* Types and Fields:: Associating fields with types.
|
|
* Scalar Field Types:: Numbers and ranges.
|
|
* String Field Types:: Lines, limited strings and regular expressions.
|
|
* Enumerated Field Types:: Enumerations and boolean values.
|
|
* Date and Time Types:: Dates and times.
|
|
* Other Field Types:: Emails, fields, UUIDs, @dots{}
|
|
@end menu
|
|
|
|
@node Declaring Types
|
|
@section Declaring Types
|
|
|
|
A type can be declared in a record descriptor by using the
|
|
@code{%typedef} special field. The syntax is:
|
|
|
|
@example
|
|
%typedef: @var{type_name} @var{type_description}
|
|
@end example
|
|
|
|
@noindent
|
|
Where @var{type_name} is the name of the new type, and
|
|
@var{type_description} a description which varies depending on the
|
|
kind of type.
|
|
@cindex @code{range}, type description
|
|
For example, this is how a type @code{Age_t} could
|
|
be defined as numbers in the range @code{0..120}:
|
|
|
|
@example
|
|
%typedef: Age_t range 0 120
|
|
@end example
|
|
|
|
@noindent
|
|
Type names are identifiers having the following syntax:
|
|
|
|
@example
|
|
[a-zA-Z][a-zA-Z0-9_]*
|
|
@end example
|
|
|
|
@noindent
|
|
Even though any identifier with that syntax could be used for types,
|
|
it is a good idea to consistently follow some convention to help
|
|
distinguish type names from field names.  For example, the
|
|
@code{_t} suffix could be used for types.
|
|
|
|
A type can be declared to be an alias for another type. The syntax
|
|
is:
|
|
|
|
@example
|
|
%typedef: @var{type_name} @var{other_type_name}
|
|
@end example
|
|
|
|
@noindent
|
|
Where @var{type_name} is declared to be a synonym of
|
|
@var{other_type_name}. This is useful to avoid duplicated type
|
|
descriptions.  For example, consider the following declarations:
|
|
|
|
@example
|
|
%typedef: Id_t int
|
|
%typedef: Item_t Id_t
|
|
%typedef: Transaction_t Id_t
|
|
@end example
|
|
|
|
@noindent
|
|
Both @code{Item_t} and @code{Transaction_t} are aliases for the type
|
|
@code{Id_t}, which is in turn an alias for the type @code{int}.
|
|
So, they are both numeric identifiers.
|
|
|
|
The order of the @code{%typedef} fields is not relevant. In
|
|
particular, a type definition can forward-reference another type that is defined
|
|
subsequently. The previous example could have been written as:
|
|
|
|
@example
|
|
%typedef: Item_t Id_t
|
|
%typedef: Transaction_t Id_t
|
|
%typedef: Id_t int
|
|
@end example
|
|
|
|
@noindent
|
|
@cindex integrity problems
|
|
The integrity check will complain if undefined types are referenced.
It will also complain if any alias ends up referring back to itself,
directly or indirectly, through other type declarations.  For
example, the following set of declarations contains a loop and is
therefore invalid:
|
|
|
|
@example
|
|
%typedef: A_t B_t
|
|
%typedef: B_t C_t
|
|
%typedef: C_t A_t
|
|
@end example
|
|
|
|
@noindent
|
|
The scope of a type is the record descriptor where it is defined.
|
|
|
|
@node Types and Fields
|
|
@section Types and Fields
|
|
|
|
@cindex @code{%type}
|
|
@cindex @code{%typedef}
|
|
|
|
@cindex types
|
|
@cindex field types
|
|
Fields can be declared to have a given type by using the @code{%type}
|
|
special field in a record descriptor. The synopsis is:
|
|
|
|
@example
|
|
%type: @var{field_list} @var{type_name_or_description}
|
|
@end example
|
|
|
|
@noindent
|
|
Where @var{field_list} is a list of field names separated by commas.
|
|
@var{type_name_or_description} can be either a type name which has
|
|
been previously declared using @code{%typedef}, or a type description.
|
|
Type names are useful when several fields are declared to be of the
|
|
same type:
|
|
|
|
@example
|
|
%typedef: Id_t int
|
|
%type: Id Id_t
|
|
%type: Product Id_t
|
|
@end example
|
|
|
|
@cindex anonymous types
|
|
@noindent
|
|
Anonymous types can be specified by writing a type description instead
|
|
of a type name. They help to avoid superfluous type declarations in
|
|
the common case where a type is used by just one field. A record
|
|
containing a single @code{Id} field, for example, can be defined
|
|
without having to use a @code{%typedef} in the following way:
|
|
|
|
@example
|
|
%rec: Task
|
|
%type: Id int
|
|
@end example
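
Once a field has a type, the tools check its values.  The following
sketch assumes a file @file{people.rec}, invented for this example,
whose second record carries an @code{Age} outside the declared range:

@example
$ cat people.rec
%rec: Person
%typedef: Age_t range 0 120
%type: Age Age_t

Name: Mr. Foo
Age: 30

Name: Mr. Bar
Age: 130
$ recfix --check people.rec
@end example

@noindent
The check is expected to report the second record and to finish with a
non-zero exit status; the exact wording of the message is not
reproduced here.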
|
|
|
|
@node Scalar Field Types
|
|
@section Scalar Field Types
|
|
|
|
The rec format supports the declaration of fields of the following
|
|
scalar types: integer numbers, ranges and real numbers.
|
|
|
|
@cindex integers
|
|
Signed @dfn{integers} are supported by using the @code{int}
|
|
declaration:
|
|
|
|
@example
|
|
%typedef: Id_t int
|
|
@end example
|
|
|
|
@cindex hexadecimal
|
|
@cindex octal
|
|
@noindent
|
|
Given the declaration above, fields of type @code{Id_t} must
|
|
contain integers, and they may be negative. Hexadecimal values can be written
|
|
using the @code{0x} prefix, and octal values using a leading
|
|
@code{0}. Valid examples are:
|
|
|
|
@example
|
|
%type: Id Id_t
|
|
|
|
Id: 100
|
|
Id: -23
|
|
Id: -0xFF
|
|
Id: 020
|
|
@end example
|
|
|
|
@cindex ranges
|
|
@noindent
|
|
Sometimes it is desirable to reduce the @dfn{range} of integers allowed in a
|
|
field. This can be achieved by using a range type declaration:
|
|
|
|
@example
|
|
%typedef: Interrupt_t range 0 15
|
|
@end example
|
|
|
|
@noindent
|
|
Note that it is possible to omit the minimum index in ranges. In that
|
|
case it is implicitly zero:
|
|
|
|
@example
|
|
%typedef: Interrupt_t range 15
|
|
@end example
|
|
|
|
@noindent
|
|
It is possible to use the keywords @code{MIN} and @code{MAX} instead
|
|
of a numeric literal for one or both of the endpoints of the
|
|
range. They mean the minimum and the maximum integer value supported
|
|
by the implementation respectively. See the following examples:
|
|
|
|
@example
|
|
%typedef: Negative range MIN -1
|
|
%typedef: Positive range 0 MAX
|
|
%typedef: AnyInt range MIN MAX
|
|
%typedef: Impossible range MAX MIN
|
|
@end example
|
|
|
|
@noindent
|
|
Hexadecimal and octal numbers can be used to specify the limits in a
|
|
range. This helps to define scalar types whose natural base is not
|
|
ten, like for example:
|
|
|
|
@example
|
|
%typedef: Address_t range 0x0000 0xFFFF
|
|
%typedef: Perms_t range 0755
|
|
@end example
|
|
|
|
@cindex reals
|
|
@cindex fractions
|
|
@cindex floating point numbers
|
|
@noindent
|
|
@dfn{Real} number fields can be declared with the @code{real} type
|
|
specifier.
|
|
A wide range of real numbers can be represented this way, only limited
|
|
by the underlying floating point representation.
|
|
@cindex decimal separator
|
|
@cindex locale
|
|
The decimal separator is always the dot (@code{.}) character regardless
|
|
of the locale setting.
|
|
For example:
|
|
|
|
@example
|
|
%typedef: Longitude_t real
|
|
@end example
|
|
|
|
@noindent
|
|
Examples of fields of type real:
|
|
|
|
@example
|
|
%rec: Rectangle
|
|
%typedef: Longitude_t real
|
|
%type: Width Longitude_t
|
|
%type: Height Longitude_t
|
|
|
|
Width: 25.01
|
|
Height: 10
|
|
@end example
|
|
|
|
|
|
@node String Field Types
|
|
@section String Field Types
|
|
|
|
@cindex strings
|
|
The @code{line} field type specifier can be used to restrict the value
|
|
of a field to a single line, @ie{} no newline characters are allowed.
|
|
For example, a type for proper names could be declared as:
|
|
|
|
@example
|
|
%typedef: Name_t line
|
|
@end example
|
|
|
|
@noindent
|
|
Examples of fields of type line:
|
|
|
|
@cindex multiline field values
|
|
@example
|
|
Name: Mr. Foo Bar
|
|
Name: Mrs. Bar Baz
|
|
Name: This is
|
|
+ invalid
|
|
@end example
|
|
|
|
@cindex field size
|
|
@cindex size, field size
|
|
@cindex @code{size}, type description
|
|
|
|
@noindent
|
|
Sometimes it is the maximum size of the field value that shall be
|
|
restricted. The @code{size} field type specifier can be used to
|
|
define the maximum number of characters a field value can have. For
|
|
example, if we were collecting input that will get written in a
|
|
paper-based forms system allowing entries up to 25 characters wide,
|
|
we could declare the entries as:
|
|
|
|
@example
|
|
%typedef: Address_t size 25
|
|
@end example
|
|
|
|
@noindent
|
|
Note that hexadecimal and octal integer constants can also be used to
|
|
specify field sizes:
|
|
|
|
@example
|
|
%typedef: Address_t size 0x19
|
|
@end example
|
|
|
|
@cindex restricting values of fields
|
|
@noindent
|
|
Arbitrary restrictions can be defined by using regular expressions.
|
|
@cindex @code{regexp}, type description
|
|
The @dfn{regexp} field type specifier introduces an ERE (extended
|
|
regular expression) that will be matched against the values of fields
having that type.  The synopsis is:
|
|
|
|
@example
|
|
%typedef: @var{type_name} regexp /@var{re}/
|
|
@end example
|
|
|
|
@noindent
|
|
where @var{re} is the regular expression to match.
|
|
|
|
For example, consider the @code{Id_t} type designed to represent
|
|
the encoding of the identifier of ID cards in some country:
|
|
|
|
@example
|
|
%typedef: Id_t regexp /[0-9]@{9@}[a-zA-Z]/
|
|
@end example
|
|
|
|
@noindent
|
|
Examples of fields of type @code{Id_t} are:
|
|
|
|
@example
|
|
IDCard: 123456789Z
|
|
IDCard: invalid id card
|
|
@end example
|
|
|
|
@noindent
|
|
Note that the slashes delimiting the @var{re} can be replaced with
|
|
any other character that is not itself used as part of the regexp.
|
|
That is useful in some cases such as:
|
|
|
|
@example
|
|
%typedef: Path_t regexp |(/[^/]/?)+|
|
|
@end example
|
|
|
|
@noindent
|
|
The regexp flavor supported in recfiles is POSIX ERE plus
|
|
several GNU extensions. @xref{Regular Expressions}.
|
|
|
|
@node Enumerated Field Types
|
|
@section Enumerated Field Types
|
|
|
|
@cindex enumerated types
|
|
Fields of this type contain symbols taken from an enumeration.
|
|
|
|
The type is described by writing the sequence of symbols comprising
|
|
the enumeration. Enumeration symbols are strings described by the
|
|
following regexp:
|
|
|
|
@example
|
|
[a-zA-Z0-9][a-zA-Z0-9_-]*
|
|
@end example
|
|
|
|
@noindent
|
|
The symbols are separated by blank characters (including newlines).
|
|
For example:
|
|
|
|
@cindex day of week
|
|
@example
|
|
%typedef: Status_t enum NEW STARTED DONE CLOSED
|
|
%typedef: Day_t enum Monday Tuesday Wednesday Thursday Friday
|
|
+ Saturday Sunday
|
|
@end example
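
@noindent
Values outside the enumeration are rejected.  As a sketch, assuming a
file @file{tasks.rec} invented for this example, whose record
descriptor declares a @code{Status} field of an enumerated type,
@command{recins} is expected to refuse the following insertion because
@samp{PAUSED} is not one of the declared symbols:

@example
$ cat tasks.rec
%rec: Task
%type: Status enum NEW STARTED DONE CLOSED

Title: Write the report
Status: NEW
$ recins -t Task -f Title -v "File the report" \
         -f Status -v PAUSED tasks.rec
@end example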
|
|
|
|
@noindent
|
|
@cindex comments, in enumerated types
|
|
It is possible to insert comments when describing an enum type. The
|
|
comments are delimited by parenthesis pairs. The contents of the
|
|
comments can be any character but parentheses. For example:
|
|
|
|
@example
|
|
%typedef: TaskStatus_t enum
|
|
+ NEW (The task was just created)
|
|
+ IN_PROGRESS (Task started)
|
|
+ CLOSED (Task closed)
|
|
@end example
|
|
|
|
@noindent
|
|
@cindex boolean types
|
|
@dfn{Boolean} fields, declared with the type specifier @code{bool},
|
|
can be seen as special enumerations holding the
|
|
binary values true and false.
|
|
|
|
@example
|
|
%typedef: Yesno_t bool
|
|
@end example
|
|
|
|
@noindent
|
|
The literals allowed in boolean fields are @code{yes/no}, @code{0/1}
|
|
and @code{true/false}. Examples are:
|
|
|
|
@example
|
|
SwitchedOn: 1
|
|
SwitchedOn: yes
|
|
SwitchedOn: false
|
|
@end example
|
|
|
|
@node Date and Time Types
|
|
@section Date and Time Types
|
|
|
|
@cindex date, fields containing dates
|
|
@cindex time, fields containing time values
|
|
The @dfn{date} field type specifier can be used to declare dates and
|
|
times. The synopsis is:
|
|
|
|
@example
|
|
%typedef: @var{type_name} date
|
|
@end example
|
|
|
|
@cindex locale
|
|
@cindex time zone correction
|
|
@noindent
|
|
There are many permitted date formats, described in detail later in this manual (@pxref{Date input formats}).
|
|
Of particular note are the following:
|
|
@itemize @minus
|
|
@item Dates and times read from recfiles are not affected by the
|
|
locale or the timezone. This means that the @env{LC_TIME} and the
|
|
@env{TZ} environment variables are ignored.
|
|
If you wish, for example, to specify a time which must be interpreted as UTC, you
|
|
must explicitly append the time zone correction: @eg{} @samp{2001-1-10 12:09Z}.
|
|
@item The field value `1/10/2001' means January 10, 2001, @strong{not} October 1, 2001.
|
|
@item Relative times and dates (such as `1 day ago') are permitted but are not
|
|
particularly useful.
|
|
@end itemize
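
Date values can be compared using the date operators of selection
expressions.  As a sketch, using the @file{stock.rec} file from an
earlier chapter, in which @code{Expiry} is declared as a date, the
following command prints the titles of the items expiring after
1 June 2009:

@example
$ recsel -t Item -e 'Expiry >> "1 June 2009"' -P Title stock.rec
Emergency Rations
@end example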
|
|
|
|
@node Other Field Types
|
|
@section Other Field Types
|
|
|
|
@cindex email
|
|
The @dfn{Email} field type specifier is used to declare electronic mail
|
|
addresses. The synopsis is:
|
|
|
|
@example
|
|
%typedef: Email_t email
|
|
@end example
|
|
|
|
@noindent
|
|
Sometimes it is useful to have fields that store field names.  For that
|
|
purpose the @dfn{Field} field type specifier is supported. The
|
|
synopsis is:
|
|
|
|
@example
|
|
%typedef: Field_t field
|
|
@end example
|
|
|
|
@noindent
|
|
@cindex UUID
|
|
Universally Unique Identifiers (also known as UUIDs) are a way to
|
|
assign a globally unique label to some object. The @dfn{uuid} field
|
|
type specifier serves that purpose. The synopsis is:
|
|
|
|
@example
|
|
%typedef: Id_t uuid
|
|
@end example
|
|
|
|
@noindent
|
|
The format of the uuids is specified as 32 hexadecimal digits,
|
|
displayed in five groups separated by hyphens. For example:
|
|
|
|
@example
|
|
550e8400-e29b-41d4-a716-446655440000
|
|
@end example
|
|
|
|
@noindent
|
|
@cindex foreign key
|
|
There is one other possible field type, @viz{} a foreign key.
|
|
The following example
|
|
defines the type @code{Maintainer_t} to be of type ``record @code{Hacker}'';
|
|
in other words, a foreign key referring to a record in the @code{Hacker} record set.
|
|
@example
|
|
%typedef: Maintainer_t rec Hacker
|
|
@end example
|
|
@noindent This essentially means that the values
|
|
to be stored in fields of type @code{Maintainer_t} are of whatever
|
|
type is defined for the primary key of the @code{Hacker} record set.
|
|
Why this is useful is discussed later. @xref{Queries which Join Records}.
|
|
|
|
@node Constraints on Record Sets
|
|
@chapter Constraints on Record Sets
|
|
|
|
The records in a recfile are by default not restricted to any particular
|
|
structure
|
|
except that they must contain one or more fields and optional comments.
|
|
This provides the format with huge expressive power;
|
|
but in many cases, it is also desirable to impose some restrictions in
|
|
order to reflect some of the properties of the data stored in the
|
|
database. It is also useful in order to preserve data integrity and
|
|
thus avoid data corruption.
|
|
|
|
The following sections describe the usage of some predefined special
|
|
fields whose purpose is to impose this kind of restriction in the
|
|
structure of the records.
|
|
|
|
@menu
|
|
* Mandatory Fields:: Requiring the presence of fields.
|
|
* Prohibited Fields:: Forbidding the presence of fields.
|
|
* Allowed Fields:: Restricting the presence of fields.
|
|
* Keys and Unique Fields:: Fields characterizing records.
|
|
* Singular Fields:: Fields with unique contents.
|
|
* Size Constraints:: Constraints on the number of records in a set.
|
|
* Arbitrary Constraints:: Constraints records must comply with.
|
|
@end menu
|
|
|
|
|
|
@node Mandatory Fields
|
|
@section Mandatory Fields
|
|
|
|
@cindex @code{%mandatory}
|
|
@cindex mandatory fields
|
|
@cindex requiring certain fields in records
|
|
@cindex compulsory fields
|
|
|
|
Sometimes, you want to make sure that @emph{every} record of a particular type
|
|
contains certain fields.
|
|
To do this, use the special field @code{%mandatory}.
|
|
The usage is:
|
|
|
|
@example
|
|
%mandatory: @var{field1} @var{field2} @dots{} @var{fieldN}
|
|
@end example
|
|
@noindent
|
|
The field names are separated by one or more
|
|
blank characters.
|
|
|
|
@cindex field, compulsory fields
|
|
@cindex field, mandatory fields
|
|
The fields listed in a @code{%mandatory} entry are
|
|
non-optional; @ie{} at least one field with this name shall be present
|
|
in any record of this kind.
|
|
@cindex integrity problems
|
|
Records violating this restriction are
|
|
invalid and a checking tool will report the situation as
|
|
a data integrity failure.
|
|
|
|
Consider for example an ``address book'' database where each record
|
|
stores the information associated with a contact. The records will be
|
|
heterogeneous, in the sense they won't all contain exactly the same
|
|
fields: the contact of an Internet shop will probably have a
|
|
@code{URL} field, while the entry for our grandmother probably won't.
|
|
We still want to make sure that every entry has a field with the name
|
|
of the contact. In this case, we could use @code{%mandatory} as
|
|
follows:
|
|
|
|
@example
|
|
%rec: Contact
|
|
%mandatory: Name
|
|
|
|
Name: Granny
|
|
Phone: +12 23456677
|
|
|
|
Name: Yoyodyne Corp.
|
|
Email: sales@@yoyod.com
|
|
Phone: +98 43434433
|
|
@end example
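
With this descriptor in place, an attempt to add a contact without a
name should be rejected.  A sketch, assuming the database above is
stored in a file called @file{contacts.rec} (the phone number is made
up):

@example
$ recins -t Contact -f Phone -v "+34 5555555" contacts.rec
@end example

@noindent
@command{recins} is expected to report the missing @code{Name} field
and leave the file unchanged.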
|
|
|
|
A word of caution, however: In many situations, especially in day to day social
|
|
interaction, it is common to find that certain information is simply unavailable.
|
|
For example, although every person has a date of birth, some people will refuse
|
|
to provide that information.
|
|
|
|
It is probably wise therefore to avoid stipulating a field as mandatory, unless it is
|
|
essential to the enterprise.
|
|
Otherwise,
|
|
a data entry clerk faced with this situation will have to make the choice between
|
|
dropping the entry entirely or entering some fake data to keep the system happy.
|
|
|
|
|
|
@node Prohibited Fields
|
|
@section Prohibited Fields
|
|
|
|
@cindex @code{%prohibit}
|
|
@cindex restricting fields from records
|
|
@cindex field, forbidden fields
|
|
@cindex prohibited fields
|
|
|
|
The inverse of @code{%mandatory} is @code{%prohibit}.
|
|
Prohibited fields may not occur in @emph{any} record of the given type.
|
|
The usage is:
|
|
|
|
@example
|
|
%prohibit: @var{field1} @var{field2} @dots{} @var{fieldN}
|
|
@end example
|
|
@noindent The field names are separated by one or more blank characters.
|
|
|
|
@noindent
|
|
Fields listed in a @code{%prohibit} entry are
|
|
forbidden; @ie{} no field with this name should be present
|
|
in any record of this kind.
|
|
Again, records violating this restriction
|
|
are invalid.
|
|
|
|
@noindent
|
|
Several @code{%prohibit} fields can appear in
|
|
the same record descriptor.
|
|
The set of prohibited fields
|
|
is the union of all the entries.
|
|
For example, in the following
|
|
database both @code{Id} and @code{id} are prohibited:
|
|
|
|
@example
|
|
%rec: Entry
|
|
%prohibit: Id
|
|
%prohibit: id
|
|
@end example
|
|
|
|
One possible use case for prohibited fields arises
|
|
when some field name is reserved for some future
|
|
use.
|
|
For example, if we were organizing a sports competition, we would want
|
|
competitors to register before the event.
|
|
However a competitor's @code{result} should not and cannot be entered
|
|
before the competition takes place.
|
|
Initially then, we would change the record
|
|
descriptor as follows:
|
|
|
|
@example
|
|
%rec: Contact
|
|
%mandatory: Name
|
|
%prohibit: result
|
|
@end example
|
|
@noindent
|
|
At the start of the event, the @code{%prohibit} line can be deleted, to
|
|
allow results to be entered.
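
As a sketch of how this plays out (the competitor name and the
@code{result} value below are invented for illustration), a record such
as the following would violate the descriptor above while the
@code{%prohibit} line is still in place:

@example
Name: Jane Roe
result: 2nd
@end example

@noindent
Once the @code{%prohibit: result} line has been deleted from the
descriptor, the same record becomes valid.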
|
|
|
|
@node Allowed Fields
|
|
@section Allowed Fields
|
|
|
|
@cindex @code{%allowed}
|
|
@cindex restricting fields from records
|
|
@cindex field, allowed fields
|
|
@cindex allowed fields
|
|
|
|
In some cases we know the set of fields that may appear in the records
|
|
of a given type, even if they are not mandatory. The @code{%allowed}
|
|
special field is used to specify this restriction. The usage is:
|
|
|
|
@example
|
|
%allowed: @var{field1} @var{field2} @dots{} @var{fieldN}
|
|
@end example
|
|
@noindent The field names are separated by one or more blank
|
|
characters.
|
|
|
|
@noindent
|
|
If there are one or more @code{%allowed} fields in a record
|
|
descriptor, all fields of all the records in the record set must be in
|
|
the union of @code{%allowed}, @code{%mandatory} and @code{%key}.
|
|
Otherwise an integrity error is raised.
|
|
|
|
@noindent
|
|
Several @code{%allowed} fields can appear in the same record
|
|
descriptor. The set of allowed fields is the union of all the
|
|
entries.
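
As an illustration, consider the following sketch, in which the
record type and all field names are hypothetical:

@example
%rec: Contact
%key: Id
%mandatory: Name
%allowed: Email Phone

Id: 1
Name: Alfred Nebel
Email: alf@@example.com

Id: 2
Name: Mandy Nebel
Fax: 01234 5676789
@end example

@noindent
Here the permitted fields are @code{Id}, @code{Name}, @code{Email}
and @code{Phone}. The first record uses only fields from that set.
The second record contains a @code{Fax} field, which is in none of
@code{%allowed}, @code{%mandatory} or @code{%key}, so it would be
expected to raise an integrity error.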
|
|
|
|
@node Keys and Unique Fields
|
|
@section Keys and Unique Fields
|
|
|
|
@cindex @code{%unique}
|
|
@cindex @code{%key}
|
|
The @code{%unique} and @code{%key} special fields are
|
|
used to avoid several instances of the
|
|
same field in a record, and to implement keys in record sets.
|
|
Their usage is:
|
|
|
|
@example
|
|
%unique: @var{field1} @var{field2} @dots{} @var{fieldN}
|
|
%key: @var{field}
|
|
@end example
|
|
|
|
@noindent
|
|
The field names are separated by one or more blank characters.
|
|
|
|
@cindex unique fields
|
|
Normally it is permitted for a record to contain two or more fields of
|
|
the same name.
|
|
The @code{%unique} special field revokes this permissiveness.
|
|
A field declared ``unique'' cannot appear more than once in a single record.
|
|
|
|
For example, an entry in an address book database could contain an
|
|
@code{Age} field. It does not make sense for a single person to be of
|
|
several ages. So, the @code{Age} field could be declared as ``unique'' in the
|
|
corresponding record descriptor as follows:
|
|
|
|
@example
|
|
%rec: Contact
|
|
%mandatory: Name
|
|
%unique: Age
|
|
@end example
|
|
|
|
@noindent
|
|
Several @code{%unique} fields can appear in the same record
|
|
descriptor. The set of unique fields is the union of all the entries.
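
For instance, with the @code{Contact} descriptor shown above, a record
like the following one (the values are made up) would be invalid,
because @code{Age} appears twice in it:

@example
Name: Alfred Nebel
Age: 31
Age: 32
@end example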
|
|
|
|
@cindex primary key
|
|
@code{%key} makes the referenced field the primary key of the record
|
|
set.
|
|
The primary key behaves as if both @code{%unique} and
|
|
@code{%mandatory} had been specified for that field.
|
|
Additionally, there is a further restriction, @viz{}
|
|
a given value of a primary key field may appear no more than once within a
|
|
record set.
|
|
|
|
Consider for example a database of items in stock. Each item is
|
|
identified by a numerical @code{Id} field. No item may have more than
|
|
one @code{Id}, and no items may exist without an associated
|
|
@code{Id}. Additionally, no two items may share the same @code{Id}.
|
|
This common situation can be implemented by declaring @code{Id} as
|
|
the key in the record descriptor:
|
|
|
|
@example
|
|
%rec: Item
|
|
%key: Id
|
|
%mandatory: Title
|
|
|
|
Id: 1
|
|
Title: Box
|
|
|
|
Id: 2
|
|
Title: Sticker big
|
|
@end example
|
|
|
|
@noindent
|
|
It would not make sense to have several primary keys in a record set.
|
|
Thus, it is not allowed to have several @code{%key} fields in the
|
|
same record descriptor.
|
|
It is also forbidden for two items to share the same @code{Id} value.
|
|
@cindex integrity problems
|
|
Both of these situations would be data integrity
|
|
violations, and will be reported by a checking tool.
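
As a sketch of such a violation, the following variation of the
record set above (the second title is invented) reuses the value
@code{1} for the key, so a checking tool would be expected to reject it:

@example
%rec: Item
%key: Id
%mandatory: Title

Id: 1
Title: Box

Id: 1
Title: Sticker small
@end example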
|
|
|
|
Elsewhere, we discuss how primary keys can be used together with foreign keys
to link one record set to another. @xref{Queries which Join Records}.
|
|
|
|
@node Singular Fields
|
|
@section Singular Fields
|
|
|
|
Sometimes we require that no two fields with a given name in a record
set have the same contents, but we don't want to (or we can't) declare
such a field as the key of the record set.
|
|
|
|
In these circumstances we can use @dfn{singular fields}, which are
|
|
declared as such in the record descriptor using the @code{%singular}
|
|
special field:
|
|
|
|
@example
|
|
%singular: @var{field}
|
|
@end example
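
As a minimal sketch, assume a hypothetical @code{Contact} record set
whose key is a numeric @code{Id}, and in which no two contacts should
share an email address:

@example
%rec: Contact
%key: Id
%singular: Email

Id: 1
Name: Alfred Nebel
Email: alf@@example.com

Id: 2
Name: Mandy Nebel
Email: alf@@example.com
@end example

@noindent
The two records have different keys, but both carry the same
@code{Email} value, so the record set would not satisfy the
@code{%singular} declaration.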
|
|
|
|
@node Size Constraints
|
|
@section Size Constraints
|
|
|
|
@cindex @code{%size}
|
|
@cindex size, record size
|
|
@cindex record size
|
|
Sometimes it is desirable to place constraints on entire record sets.
|
|
This can be done with the @code{%size} special field which is used to limit the
|
|
number of records in a record set. Its usage is:
|
|
|
|
@example
|
|
%size: [@var{relational_operator}] @var{number}
|
|
@end example
|
|
|
|
@noindent
|
|
If no operator is specified then @var{number} is interpreted as the
|
|
exact number of records of this type. The number can be any integer
|
|
literal, including hexadecimal and octal constants. For example:
|
|
|
|
@example
|
|
%rec: Day
|
|
%size: 7
|
|
%type: Name enum
|
|
+ Monday Tuesday Wednesday Thursday Friday
|
|
+ Saturday Sunday
|
|
%doc: There should be exactly 7 days.
|
|
@end example
|
|
|
|
@cindex operators
|
|
The optional @var{relational_operator} shall be one of @code{<},
|
|
@code{<=}, @code{>} and @code{>=}@. For example:
|
|
|
|
@example
|
|
%rec: Item
|
|
%key: Id
|
|
%size: <= 100
|
|
%doc: We have at most 100 different articles.
|
|
@end example
|
|
|
|
It is valid to specify a size of @code{0}, meaning that no records of
|
|
this type shall exist in the file.
|
|
|
|
Only one @code{%size} field shall appear in a record descriptor.
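
As a further sketch, the bound can also be written as a hexadecimal
constant. The record type @code{Seat} is hypothetical, and we assume
here the usual @code{0x} prefix for hexadecimal literals:

@example
%rec: Seat
%size: <= 0x20
%doc: The room has at most 32 seats.
@end example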
|
|
|
|
|
|
@node Arbitrary Constraints
|
|
@section Arbitrary Constraints
|
|
@cindex @code{%constraint}
|
|
@cindex restricting values of fields
|
|
|
|
Occasionally, @code{%mandatory}, @code{%prohibit} and @code{%size} are just not flexible enough.
|
|
We might, for instance, want to ensure that @emph{if} a field is present,
|
|
then it must have a certain relationship to other fields.
|
|
Or we might want to stipulate that under certain conditions only, a record contains
|
|
a particular field.
|
|
|
|
To this end, recutils provides a way for arbitrary field constraints to be defined.
|
|
These permit restrictions on the presence and/or value of fields, based upon the value or
|
|
presence of other fields within that record.
|
|
This is done using the @code{%constraint} special field.
|
|
Its usage is:
|
|
|
|
@example
|
|
%constraint: @var{expr}
|
|
@end example
|
|
|
|
@noindent
|
|
where @var{expr} is a selection expression (@pxref{Selection Expressions}).
|
|
When a constraint is
|
|
present in a record set it means that all the records of that type
|
|
must satisfy the selection expression, @ie{} the evaluation of the
|
|
expression with the record returns 1. Otherwise an integrity error is
|
|
raised.
|
|
@cindex integrity problems
|
|
|
|
|
|
Consider for example a record type @code{Task} featuring two fields of
|
|
type date called @code{Start} and @code{End}. We can use a constraint
|
|
in the record set to specify that the task cannot start after it
|
|
finishes:
|
|
|
|
@example
|
|
%rec: Task
|
|
%type: Start,End date
|
|
%constraint: Start << End
|
|
@end example
|
|
|
|
@cindex implies, logical implication
|
|
@cindex constraints
|
|
The ``implies'' operator @code{=>} is especially useful when defining
|
|
constraints, since it can be used to specify conditional constraints,
|
|
@ie{} constraints applying only in certain records. For example, we
|
|
could specify that if a task is closed then it must have an @code{End}
|
|
date in the following way:
|
|
|
|
@example
|
|
%rec: Task
|
|
%type: Start,End date
|
|
%constraint: Start << End
|
|
%constraint: Status = 'CLOSED' => #End
|
|
@end example
|
|
|
|
It is acceptable to declare several constraints in the same record
|
|
set.
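
As a sketch of how these constraints apply to data, consider the
following records; the titles, dates and the @code{Status} values are
invented for illustration:

@example
%rec: Task
%type: Start,End date
%constraint: Start << End
%constraint: Status = 'CLOSED' => #End

Title: Write report
Status: CLOSED
Start: 1 March 2012
End: 5 March 2012

Title: Review report
Status: CLOSED
Start: 6 March 2012
@end example

@noindent
The first record starts before it ends and, being closed, has an
@code{End} date, so it satisfies both constraints. The second record
is closed but has no @code{End} field, so it does not satisfy the
second constraint.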
|
|
|
|
@node Checking Recfiles
|
|
@chapter Checking Recfiles
|
|
|
|
@cindex integrity, checking
|
|
Sometimes, when creating a recfile by hand, typographical errors or other
|
|
mistakes will occur.
|
|
If a recfile contains such mistakes, then one cannot rely upon the results
|
|
of queries or other operations.
|
|
Fortunately
|
|
there is a tool called @command{recfix} which can find these errors.
|
|
It is a good idea to get into the habit of running @command{recfix} on
|
|
a file after editing it, and before trying other commands.
|
|
|
|
|
|
@menu
|
|
* Syntactical Errors:: Fixing structure errors in recfiles.
|
|
* Semantic Errors:: Fixing semantic errors in recfiles.
|
|
@end menu
|
|
|
|
@node Syntactical Errors
|
|
@section Syntactical Errors
|
|
|
|
One easy mistake is to forget the colon separating the field name from
|
|
its value.
|
|
|
|
@example
|
|
%rec: Article
|
|
%key Id
|
|
|
|
Name: Thing
|
|
Id: 0
|
|
@end example
|
|
@cindex @command{recfix}
|
|
@noindent
|
|
Running @command{recfix} on this file will immediately tell us that
|
|
there is a problem:
|
|
|
|
@example
|
|
$ recfix --check inventory.rec
|
|
inventory.rec: 2: error: expected a record
|
|
@end example
|
|
@noindent
|
|
Here, @command{recfix} has diagnosed a problem in the file @file{inventory.rec}
|
|
and the problem lies at line 2.
|
|
If, as in this case, @command{recfix} shows there is a problem with
|
|
the recfile, you should attend to that problem before trying to use
|
|
any other recutils program on that file, otherwise strange things
|
|
could happen.
|
|
The @code{--check} flag is optional; in normal use it is not required, because checking is the
|
|
default operation.
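
For reference, this is the same file with the missing colon restored
on line 2; it no longer contains the syntactical error discussed above:

@example
%rec: Article
%key: Id

Name: Thing
Id: 0
@end example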
|
|
|
|
@node Semantic Errors
|
|
@section Semantic Errors
|
|
|
|
@cindex special fields
|
|
However @command{recfix} checks more than the syntactical integrity of the recfile.
|
|
It also checks certain semantics and that the data is self-consistent.
|
|
To do this, it uses the special fields of the record descriptor, some of which were introduced
|
|
above (@pxref{Constraints on Record Sets}).
|
|
It is a good idea to use the special fields to stipulate the ``enterprise rules''
|
|
of the data.
|
|
|
|
Errors will be reported if any of the following special keywords are present and
|
|
the data does not match the stipulated conditions:
|
|
@table @code
|
|
@item %mandatory
|
|
The mandated fields are missing from a record.
|
|
@item %prohibit
|
|
The prohibited fields are present in a record.
|
|
@item %unique
|
|
There is more than one field in a single record of the given name.
|
|
@item %key
|
|
Two or more records share the same value of the field which is the key field.
|
|
@item %typedef and %type
|
|
A field has a value which does not conform to the specified type.
|
|
@item %size
|
|
The number of records does not conform to the specified restriction.
|
|
@item %constraint
|
|
A field does not conform to the specified constraint.
|
|
@item %confidential
|
|
An unencrypted value exists for a confidential field.
|
|
@end table
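
As an illustration, the following sketch of a recfile (its contents are
invented for this purpose) is syntactically correct but violates two of
the conditions listed above:

@example
%rec: Article
%key: Id
%mandatory: Name

Id: 0
Name: Thing

Id: 0
@end example

@noindent
The second record lacks the mandatory @code{Name} field, and it
repeats the key value @code{0} already used by the first record.
Checking such a file with @command{recfix --check} would be expected
to report both problems.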
|
|
|
|
|
|
@node Remote Descriptors
|
|
@chapter Remote Descriptors
|
|
|
|
@cindex @code{%rec}
|
|
The @code{%rec} special field is used for two main purposes: to
|
|
identify a record as a record descriptor, and to provide a name for
|
|
the described record set. The synopsis of the usage of the field is
|
|
the following:
|
|
|
|
@example
|
|
%rec: @var{type} [@var{url_or_file}]
|
|
@end example
|
|
|
|
@noindent
|
|
@var{type} is the name of the kind of records described by the
|
|
descriptor. It is mandatory to specify it, and it follows the same
|
|
lexical conventions used by field names. @xref{Fields}.
|
|
There is a non-enforced convention to use singular nouns, because the
|
|
name makes reference to the type of a single entity, even if it
|
|
applies to all the records contained in the record set. For example,
|
|
the following record set contains transactions, and the type specified
|
|
in the record descriptor is @code{Transaction}.
|
|
|
|
@example
|
|
%rec: Transaction
|
|
|
|
Id: 10
|
|
Title: House rent
|
|
|
|
Id: 11
|
|
Title: Loan
|
|
@end example
|
|
|
|
@noindent
|
|
Only one @code{%rec} field should be in a record descriptor. If
|
|
there are more it is an integrity violation. It is highly
|
|
recommended (but not enforced) to place this field in the first
|
|
position of the record descriptor.
|
|
|
|
Sometimes it is convenient to store records of the same type in
|
|
different files.
|
|
@cindex integrity problems
|
|
The duplication of record descriptors in this case would surely lead to
|
|
consistency problems.
|
|
A possible solution would
|
|
be to keep the record descriptor in a separate file and then include
|
|
it in any operation by using pipes. For example:
|
|
|
|
@example
|
|
$ cat descriptor.rec data.rec | recsel @dots{}
|
|
@end example
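
As a sketch of this arrangement, assume that @file{descriptor.rec}
holds only the record descriptor and @file{data.rec} holds only the
records; both file contents below are hypothetical. The file
@file{descriptor.rec} could be:

@example
%rec: Transaction
%key: Id
@end example

@noindent
and @file{data.rec} could be:

@example
Id: 10
Title: House rent

Id: 11
Title: Loan
@end example

@noindent
Since records are separated by blank lines, the concatenated stream
must contain a blank line between the descriptor and the first record,
for instance as the last line of @file{descriptor.rec}.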
|
|
|
|
@cindex external descriptor
|
|
@cindex descriptor, external descriptor
|
|
@noindent
|
|
For those cases it is more convenient to use an @dfn{external
descriptor}. External descriptors can be built by appending a file path
to the @code{%rec} field value, like:
|
|
|
|
@example
|
|
%rec: FSD_Entry /path/to/file.rec
|
|
@end example
|
|
|
|
The previous example indicates that a record descriptor describing the
|
|
@code{FSD_Entry} records shall be read from the file
|
|
@file{/path/to/file.rec}. A record descriptor for @code{FSD_Entry}
|
|
need not exist in the external file. Both relative and absolute paths
|
|
can be specified there.
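
As a sketch, and assuming hypothetical field names, the external file
@file{/path/to/file.rec} could contain a descriptor such as:

@example
%rec: FSD_Entry
%key: Id
%mandatory: Name
@end example

@noindent
A recfile declaring @code{%rec: FSD_Entry /path/to/file.rec} would
then be expected to be subject to these special fields, without having
to repeat them.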
|
|
|
|
@cindex URL
|
|
@cindex remote descriptors
|
|
URLs can be used as sources for external descriptors as well. In that
|
|
case we talk about @dfn{remote descriptors}. For example:
|
|
|
|
@example
|
|
%rec: Department http://www.myorg.com/Org.rec
|
|
@end example
|
|
|
|
@noindent
|
|
The URL shall point to a text file containing rec data. If there is a
|
|
record descriptor in the remote file documenting the @code{Department}
|
|
type, it will be used.
|
|
|
|
Note that the local record descriptor can provide additional fields to
|
|
``expand'' the record type. For example:
|
|
|
|
@example
|
|
%rec: FSD_Entry http://www.jemarch.net/downloads/FSD.rec
|
|
%mandatory: Rating
|
|
@end example
|
|
|
|
@noindent
|
|
The record descriptor above includes the contents of the
@code{FSD_Entry} record descriptor from the URL, and adds them to
the local record descriptor, which in this case contains just the
@code{%mandatory} field.
|
|
|
|
If you are using GNU recutils (@pxref{Invoking the Utilities}) to
|
|
process your recfiles, any URL
|
|
scheme supported by @code{libcurl} will work.
|
|
|
|
@node Grouping and Aggregates
|
|
@chapter Grouping and Aggregates
|
|
|
|
Grouping and aggregate functions are two related features which
|
|
are useful to extract statistics from a record set, or a
|
|
subset of that record set.
|
|
|
|
@menu
|
|
* Grouping Records:: Combining records by fields.
|
|
* Aggregate Functions:: Statistics and more.
|
|
@end menu
|
|
|
|
@node Grouping Records
|
|
@section Grouping Records
|
|
@cindex grouping
|
|
|
|
Consider a recfile containing a list of items in a shop
inventory. For each item we store its type, its category, its
price, the date of the last sale of an item of that type,
and the number of items currently available in stock. A sample of
such a database could be:
|
|
|
|
@example
|
|
Type: EC Car
|
|
Category: Toy
|
|
Price: 12.2
|
|
LastSell: 20-April-2012
|
|
Available: 623
|
|
|
|
Type: Terria
|
|
Category: Food
|
|
Price: 0.60
|
|
LastSell: 22-April-2012
|
|
Available: 8239
|
|
|
|
Type: Typex
|
|
Category: Office
|
|
Price: 1.20
|
|
LastSell: 22-April-2012
|
|
Available: 10878
|
|
|
|
Type: Notebook
|
|
Category: Office
|
|
Price: 1.00
|
|
LastSell: 21-April-2012
|
|
Available: 77455
|
|
|
|
Type: Sexy Puzzle
|
|
Category: Toy
|
|
Price: 6.20
|
|
LastSell: 6.20
|
|
Available: 12
|
|
@end example
|
|
|
|
@noindent
|
|
Now imagine we are interested in grouping the contents of the
|
|
@code{Items} record set in groups of items of the same category. We
|
|
can do it using the @command{-G} command line argument for
|
|
@command{recsel}. This argument accepts a list of fields separated by
|
|
commas. The argument can be read as ``group by''.
|
|
|
|
In this case we want to group by @code{Category}, so we would do:
|
|
|
|
@example
|
|
$ recsel -G Category items.rec
|
|
Type: Terria
|
|
Category: Food
|
|
Price: 0.60
|
|
LastSell: 22-April-2012
|
|
Available: 8239
|
|
|
|
Type: Typex
|
|
Category: Office
|
|
Price: 1.20
|
|
LastSell: 22-April-2012
|
|
Available: 10878
|
|
Type: Notebook
|
|
Price: 1.00
|
|
LastSell: 21-April-2012
|
|
Available: 77455
|
|
|
|
Type: EC Car
|
|
Category: Toy
|
|
Price: 12.2
|
|
LastSell: 20-April-2012
|
|
Available: 623
|
|
Type: Sexy Puzzle
|
|
Price: 6.20
|
|
LastSell: 6.20
|
|
Available: 12
|
|
@end example
|
|
|
|
@noindent
|
|
We can see that the output is three records, corresponding to the three
|
|
different categories of items present in the database.
|
|
However, we are only interested in the types of products in each category,
|
|
so we can remove unwanted information using @code{-p}:
|
|
|
|
@example
|
|
$ recsel -G Category -p Category,Type items.rec
|
|
Category: Food
|
|
Type: Terria
|
|
|
|
Category: Office
|
|
Type: Typex
|
|
Type: Notebook
|
|
|
|
Category: Toy
|
|
Type: EC Car
|
|
Type: Sexy Puzzle
|
|
@end example
|
|
|
|
@noindent
|
|
It is also possible to group by several fields. We could group by
|
|
both @code{Category} and @code{LastSell}:
|
|
|
|
@example
|
|
$ recsel -G Category,LastSell -p Category,LastSell,Type items.rec
|
|
Category: Food
|
|
LastSell: 22-April-2012
|
|
Type: Terria
|
|
|
|
Category: Office
|
|
LastSell: 21-April-2012
|
|
Type: Notebook
|
|
|
|
Category: Office
|
|
LastSell: 22-April-2012
|
|
Type: Typex
|
|
|
|
Category: Toy
|
|
LastSell: 20-April-2012
|
|
Type: EC Car
|
|
|
|
Category: Toy
|
|
LastSell: 6.20
|
|
Type: Sexy Puzzle
|
|
@end example
|
|
|
|
@node Aggregate Functions
|
|
@section Aggregate Functions
|
|
@cindex aggregate function
|
|
|
|
recutils supports @dfn{aggregate functions}. These are so called
|
|
because they accept a record set and a field name as inputs and
|
|
generate a single result. Usually this result is numerical.
|
|
|
|
The supported aggregate functions are the following:
|
|
|
|
@table @code
|
|
@item Count(FIELD)
|
|
Counts the number of occurrences of a field.
|
|
@item Avg(FIELD)
|
|
Calculates the average (mean) of the numerical values of a field.
|
|
@item Sum(FIELD)
|
|
Calculates the sum of the numerical values of a field.
|
|
@item Min(FIELD)
|
|
Calculates the minimum numerical value of a field.
|
|
@item Max(FIELD)
|
|
Calculates the maximum numerical value of a field.
|
|
@end table
|
|
|
|
The aggregate functions are to be invoked in the field expressions in
|
|
@command{recsel}. By default they are applied to the totality of the
|
|
records in a record set. For example, using the items database from
|
|
the previous section, we can do calculations as in the following examples.
|
|
|
|
For example, using the @code{Count} aggregate
function we can calculate the number of fields named @code{Category}
present in the record set as follows:
|
|
|
|
@example
|
|
$ recsel -p "Count(Category)" items.rec
|
|
Count_Category: 5
|
|
@end example
|
|
|
|
@noindent
|
|
The result is a field whose name is derived from the function name and
|
|
the field passed as its parameter, separated by an underscore. This
|
|
name scheme probably suffices for most purposes, but it is always
|
|
possible to use a rewrite rule to obtain something different:
|
|
|
|
@example
|
|
$ recsel -p "Count(Category):NumCategories" items.rec
|
|
NumCategories: 5
|
|
@end example
|
|
|
|
@noindent
|
|
You can use different letter case in writing the name of the aggregate, and
|
|
this will be reflected in the field name:
|
|
|
|
@example
|
|
$ recsel -p "CoUnT(Category)" items.rec
|
|
CoUnT_Category: 5
|
|
@end example
|
|
|
|
@noindent
|
|
It is possible to use more than one aggregate function in the field
|
|
expression. Suppose we are also interested in the average price of
|
|
the items we sell. We can use the @code{Avg} aggregate:
|
|
|
|
@example
|
|
$ recsel -p "Count(Category),Avg(Price)" items.rec
|
|
Count_Category: 5
|
|
Avg_Price: 4.240000
|
|
@end example
|
|
|
|
@noindent
|
|
Now let's add a field along with an aggregate function to the field
|
|
expression and see what we get:
|
|
|
|
@example
|
|
$ recsel -p "Type,Avg(Price)" items.rec
|
|
Type: EC Car
|
|
Avg_Price: 12.200000
|
|
|
|
Type: Terria
|
|
Avg_Price: 0.600000
|
|
|
|
Type: Typex
|
|
Avg_Price: 1.200000
|
|
|
|
Type: Notebook
|
|
Avg_Price: 1
|
|
|
|
Type: Sexy Puzzle
|
|
Avg_Price: 6.200000
|
|
@end example
|
|
|
|
@noindent
|
|
We get five records! The reason is that when @emph{only} aggregate
|
|
functions are part of the field expression, they are applied to the single
|
|
record that would result from concatenating all the records in the record
|
|
set together. However, when a regular field appears in the field
|
|
expression the aggregate functions are applied to the individual
|
|
records. This is still useful in some cases, such as a database of
|
|
maintainers:
|
|
|
|
@example
|
|
Name: Jose E. Marchesi
|
|
Email: jemarch@@gnu.org
|
|
Email: jemarch@@es.gnu.org
|
|
|
|
Name: Luca Saiu
|
|
Email: positron@@gnu.org
|
|
@end example
|
|
|
|
@noindent
|
|
Let's see how many emails each maintainer has:
|
|
|
|
@example
|
|
$ recsel -p "Name,Count(Email)" maintainers.rec
|
|
Name: Jose E. Marchesi
|
|
Count_Email: 2
|
|
|
|
Name: Luca Saiu
|
|
Count_Email: 1
|
|
@end example
|
|
|
|
@noindent
|
|
Aggregate functions are most useful when we combine them with
|
|
grouping. This is when we are interested in some property of a subset
|
|
of the records in the database. For example, the average prices of
|
|
each item category stored in the database can be obtained by
|
|
executing:
|
|
|
|
@example
|
|
$ recsel -p "Category,Avg(Price)" -G Category items.rec
|
|
Category: Food
|
|
Avg_Price: 0.600000
|
|
|
|
Category: Office
|
|
Avg_Price: 1.100000
|
|
|
|
Category: Toy
|
|
Avg_Price: 9.200000
|
|
@end example
|
|
|
|
@noindent
|
|
If we are interested in the actual prices that result in each average
|
|
we can do:
|
|
|
|
@example
|
|
$ recsel -p "Category,Price,Avg(Price)" -G Category items.rec
|
|
Category: Food
|
|
Price: 0.60
|
|
Avg_Price: 0.600000
|
|
|
|
Category: Office
|
|
Price: 1.20
|
|
Price: 1.00
|
|
Avg_Price: 1.100000
|
|
|
|
Category: Toy
|
|
Price: 12.2
|
|
Price: 6.20
|
|
Avg_Price: 9.200000
|
|
@end example
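
The other aggregate functions combine with grouping in the same way.
As a sketch, the total number of items in stock for each category
could be requested as follows; following the naming scheme described
above, the result would be expected in a field named
@code{Sum_Available}:

@example
$ recsel -p "Category,Sum(Available)" -G Category items.rec
@end example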
|
|
|
|
@node Queries which Join Records
|
|
@chapter Queries which Join Records
|
|
|
|
Suppose you wanted to add the residential address of the people in
|
|
the @file{acquaintances.rec} file from
|
|
@ref{Simple Selections}.
|
|
|
|
|
|
One way to do this is as follows:
|
|
@example
|
|
%type: Dob date
|
|
|
|
Name: Alfred Nebel
|
|
Dob: 20 April 2010
|
|
Email: alf@@example.com
|
|
Address: 42 Abbeter Way, Inprooving, WORCS
|
|
Telephone: 01234 5676789
|
|
|
|
Name: Mandy Nebel
|
|
Dob: 21 February 1972
|
|
Email: mandy@@example.com
|
|
Address: 42 Abbeter Way, Inprooving, WORCS
|
|
Telephone: 01234 5676789
|
|
|
|
Name: Bertram Nebel
|
|
Dob: 3 January 1966
|
|
Email: bert@@example.com
|
|
Address: 42 Abbeter Way, Inprooving, WORCS
|
|
Telephone: 01234 5676789
|
|
|
|
Name: Charles Spencer
|
|
Dob: 4 July 1997
|
|
Email: charlie@@example.com
|
|
Address: 2 Serpe Rise, Little Worning, SURREY
|
|
Telephone: 09876 5432109
|
|
|
|
Name: Dirk Spencer
|
|
Dob: 29 June 1945
|
|
Email: dirk@@example.com
|
|
Address: 2 Serpe Rise, Little Worning, SURREY
|
|
Telephone: 09876 5432109
|
|
|
|
Name: Ernest Wright
|
|
Dob: 26 April 1978
|
|
Email: ernie@@example.com
|
|
Address: 1 Wanter Rise, Greater Inncombe, BUCKS
|
|
@end example
|
|
|
|
|
|
This will work fine.
|
|
However you will notice that there are two addresses where more than one person
|
|
lives (presumably they are members of the same family).
|
|
This has a number of disadvantages:
|
|
@itemize @minus
|
|
@item You have to type (or copy) the same information several times.
|
|
@item Should a family move house, then you would have to update the addresses (and telephone number) of all the family members.
|
|
@item A typing error in one of the addresses would lead an automatic query to erroneously suggest that the people lived at different addresses.
|
|
@item It unnecessarily increases the size of the recfile.
|
|
@end itemize
|
|
|
|
|
|
@menu
|
|
* Foreign Keys:: Referring to records from other records.
|
|
* Joining Records:: Performing cross-joins.
|
|
@end menu
|
|
|
|
|
|
@node Foreign Keys
|
|
@section Foreign Keys
|
|
|
|
@cindex record sets
|
|
A better way would be to separate the addresses and people into different record sets.
|
|
@cindex duplication, avoiding
|
|
The first record set might look like this:
|
|
|
|
@example
|
|
%rec: Person
|
|
%type: Dob date
|
|
%type: Abode rec Residence
|
|
|
|
|
|
Name: Alfred Nebel
|
|
Dob: 20 April 2010
|
|
Email: alf@@example.com
|
|
Abode: 42AbbeterWay
|
|
|
|
Name: Mandy Nebel
|
|
Dob: 21 February 1972
|
|
Email: mandy@@example.com
|
|
Mobile: 0555 342123
|
|
Abode: 42AbbeterWay
|
|
|
|
Name: Bertram Nebel
|
|
Dob: 3 January 1966
|
|
Email: bert@@example.com
|
|
Abode: 42AbbeterWay
|
|
|
|
Name: Charles Spencer
|
|
Dob: 4 July 1997
|
|
Email: charlie@@example.com
|
|
Abode: 2SerpeRise
|
|
|
|
Name: Dirk Spencer
|
|
Dob: 29 June 1945
|
|
Email: dirk@@example.com
|
|
Mobile: 0555 342123
|
|
Abode: 2SerpeRise
|
|
|
|
Name: Ernest Wright
|
|
Dob: 26 April 1978
|
|
Abode: ChezGrampa
|
|
|
|
@end example
|
|
|
|
@noindent and the second (following in the same file), like this:
|
|
|
|
@example
|
|
|
|
%rec: Residence
|
|
%key: Id
|
|
|
|
Address: 42 Abbeter Way, Inprooving, WORCS
|
|
Telephone: 01234 5676789
|
|
Id: 42AbbeterWay
|
|
|
|
Address: 2 Serpe Rise, Little Worning, SURREY
|
|
Telephone: 09876 5432109
|
|
Id: 2SerpeRise
|
|
|
|
Address: 1 Wanter Rise, Greater Inncombe, BUCKS
|
|
Id: ChezGrampa
|
|
@end example
|
|
|
|
Here you can see that there are two record sets @viz{} @code{Person}
|
|
and @code{Residence}.
|
|
There are six people, but only three residences, because some residences
|
|
accommodate more than one person.
|
|
@cindex @code{%key}
|
|
Note also that the @code{Residence} descriptor has the entry @code{%key: Id}
|
|
whilst the @code{Person} descriptor has @code{%type: Abode rec Residence}.
|
|
@cindex foreign key
|
|
@cindex key, foreign key
|
|
@cindex @code{rec}, type description
|
|
This is because @code{Abode} is the foreign key which identifies the residence
|
|
where a person lives.
|
|
|
|
@cindex readability
|
|
We could have declared the @code{Id} field as @code{%auto}. This would have had
|
|
the advantage that we need not manually update it.
|
|
However, we decided that the @code{Abode} field values in the @code{Person} records
|
|
are better as alphanumeric fields, so that they can contain
|
|
human readable values. In this way, it is self-evident by reading a @code{Person}
|
|
record where that person lives.
|
|
Yet since the @code{Id} field is declared using the @code{%key} special field
|
|
name, you can be sure that you don't accidentally reuse an existing key.
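
For comparison, a sketch of the alternative with an automatically
generated key might look like this; the numeric value shown is
illustrative:

@example
%rec: Residence
%key: Id
%auto: Id

Address: 42 Abbeter Way, Inprooving, WORCS
Telephone: 01234 5676789
Id: 0
@end example

@noindent
A @code{Person} record would then refer to this residence with
@code{Abode: 0}, which is less informative to a human reader than
@code{Abode: 42AbbeterWay}.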
|
|
|
|
@node Joining Records
|
|
@section Joining Records
|
|
|
|
The above example has also added a new field to the @code{Person} record set
|
|
to contain that person's mobile phone number. Note that the @code{Telephone}
|
|
field belongs to the @code{Residence} record set because that contains the telephone
|
|
number of the home,
|
|
whereas @code{Mobile} belongs to @code{Person} since mobile telephones are normally
|
|
used exclusively by one individual.
|
|
|
|
If we want to look up the name and address of a person in our recfile, we can
|
|
use @command{recsel} as before.
|
|
Because we now have more than one record set in the @file{acquaintances.rec}
|
|
file, we have to tell @command{recsel} in which record set we want to
|
|
look up
|
|
records.
|
|
We do this with the @code{-t} flag as follows:
|
|
|
|
@example
|
|
$ recsel -t Person -P Name,Abode acquaintances.rec
|
|
Alfred Nebel
|
|
42AbbeterWay
|
|
|
|
Mandy Nebel
|
|
42AbbeterWay
|
|
|
|
Bertram Nebel
|
|
42AbbeterWay
|
|
|
|
Charles Spencer
|
|
2SerpeRise
|
|
|
|
Dirk Spencer
|
|
2SerpeRise
|
|
|
|
Ernest Wright
|
|
ChezGrampa
|
|
@end example
|
|
|
|
This result tells us the names of all the people in the recfile, as well as
|
|
giving a concise and hopefully effective reminder telling us where they live.
|
|
However these results would not be useful to someone unacquainted with the
|
|
individuals.
|
|
They need a list of names and full addresses.
|
|
We can use @command{recsel} to produce such a list:
|
|
|
|
@example
|
|
$ recsel -t Person -j Abode acquaintances.rec
|
|
Name: Charles Spencer
|
|
Dob: 4 July 1997
|
|
Email: charlie@@example.com
|
|
Abode_Address: 2 Serpe Rise, Little Worning, SURREY
|
|
Abode_Telephone: 09876 5432109
|
|
Abode_Id: 2SerpeRise
|
|
|
|
Name: Dirk Spencer
|
|
Dob: 29 June 1945
|
|
Email: dirk@@example.com
|
|
Mobile: 0555 342123
|
|
Abode_Address: 2 Serpe Rise, Little Worning, SURREY
|
|
Abode_Telephone: 09876 5432109
|
|
Abode_Id: 2SerpeRise
|
|
|
|
Name: Ernest Wright
|
|
Dob: 26 April 1978
|
|
Abode_Address: 1 Wanter Rise, Greater Inncombe, BUCKS
|
|
Abode_Id: ChezGrampa
|
|
@end example
|
|
|
|
The @code{-t} flag we have seen before. It tells @command{recsel} that we want
|
|
to extract records of type @code{Person}.
|
|
@cindex join
|
|
The @code{-j} flag is new. It says that we want to perform a @dfn{join}.
|
|
Specifically we want to join the @code{Person} records according to their
|
|
@code{Abode} field.
|
|
|
|
In the above example, @command{recsel} displays several field names which
|
|
do not appear anywhere in the input @eg{} @code{Abode_Address}.
|
|
This is the @code{Address} field in the record joined by the foreign key @code{Abode}.
|
|
In this example probably only the name and address are of interest.
|
|
The other information such as date of birth is incidental.
|
|
The foreign key @code{Abode_Id} is certainly not wanted in the output since it
|
|
is redundant.
|
|
As usual, you can use the @code{-P} or @code{-p} options to limit the fields
|
|
which will be displayed.
|
|
However the full joined field name, if appropriate, must be specified.
|
|
So the names and addresses without the other information can be retrieved thus:
|
|
|
|
@example
|
|
$ recsel -t Person -j Abode -p Name,Abode_Address acquaintances.rec
|
|
Name: Charles Spencer
|
|
Abode_Address: 2 Serpe Rise, Little Worning, SURREY
|
|
|
|
Name: Dirk Spencer
|
|
Abode_Address: 2 Serpe Rise, Little Worning, SURREY
|
|
|
|
Name: Ernest Wright
|
|
Abode_Address: 1 Wanter Rise, Greater Inncombe, BUCKS
|
|
@end example
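
The same applies to @code{-P}. As a sketch, the following command
would be expected to print only the values of the joined
@code{Abode_Telephone} field, without the field names:

@example
$ recsel -t Person -j Abode -P Abode_Telephone acquaintances.rec
@end example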
|
|
|
|
@node Auto-Generated Fields
|
|
@chapter Auto-Generated Fields
|
|
|
|
@cindex @code{%auto}
|
|
@cindex automatically generated values
|
|
Consider for example a list of articles in stock in a toy store:
|
|
|
|
@example
|
|
%rec: Item
|
|
%key: Description
|
|
|
|
Description: 2cm metal soldier WWII
|
|
Amount: 2111
|
|
|
|
Description: Flying Helicopter Indoor Maxi
|
|
Amount: 8
|
|
|
|
@dots{}
|
|
@end example
|
|
|
|
It would be natural to identify the items by their descriptions, but it
|
|
is also error prone: was it ``Flying Helicopter Indoor Maxi'' or
|
|
``Flying Helicopter Maxi Indoor''? Was ``Helicopter'' in lower case or
|
|
upper case?
|
|
|
|
@cindex primary key
|
|
@cindex key, primary key
|
|
@cindex @code{%key}
|
|
@cindex ID numbers
|
|
Thus it is quite common in databases to use some kind of numeric ``Id'' to
|
|
uniquely identify items like these, because numbers are
|
|
easy to increment and manipulate. So we could add a new
|
|
numeric @code{Id} field and use it as the primary key:
|
|
|
|
@example
|
|
%rec: Item
|
|
%key: Id
|
|
%mandatory: Description
|
|
|
|
Id: 0
|
|
Description: 2cm metal soldier WWII
|
|
Amount: 2111
|
|
|
|
Id: 1
|
|
Description: Flying Helicopter Indoor Maxi
|
|
Amount: 8
|
|
|
|
@dots{}
|
|
@end example
|
|
|
|
A problem with this approach is that we must be careful not to assign
Ids that are already in use when we introduce more articles in the
|
|
database. Other than its uniqueness, it is not important which number
|
|
is associated with which article.
|
|
|
|
To ease the management of those Ids, database systems usually provide a
|
|
facility called ``auto-counters''. Auto-counters can be implemented in
|
|
recfiles using the @code{%auto} directive in the record descriptor.
|
|
Its usage is:
|
|
|
|
@example
|
|
%auto: @var{field1} @var{field2} @dots{} @var{fieldN}
|
|
@end example
|
|
|
|
@noindent
|
|
The field names are separated by one or more blank characters.
|
|
There can be several @code{%auto} fields in the same record
|
|
descriptor, the effective list of auto-generated fields being the
|
|
union of all the entries.
|
|
|
|
When @command{recins} inserts a new record in the recfile, it looks
|
|
for any declared auto field. If any of these fields are not provided
|
|
explicitly in the command line then @command{recins} generates them
|
|
along with the user-provided fields. Such auto fields are generated
|
|
at the beginning of the new records, in the same order they are found
|
|
in the @code{%auto} directives.
|
|
|
|
For example, consider a @file{items.rec} database with an empty record
|
|
set:
|
|
|
|
@example
|
|
%rec: Item
|
|
%key: Id
|
|
%auto: Id
|
|
%mandatory: Description
|
|
@end example
|
|
|
|
@noindent
|
|
If we insert a new record and we do not specify an @code{Id} then it
|
|
will be generated automatically by @command{recins}:
|
|
|
|
@example
|
|
$ recins -t Item -f Description -v 'recutils t-shirts' \
|
|
-f Amount -v 200 \
|
|
items.rec
|
|
$ cat items.rec
|
|
%rec: Item
|
|
%key: Id
|
|
%auto: Id
|
|
%mandatory: Description
|
|
|
|
Id: 0
|
|
Description: recutils t-shirts
|
|
Amount: 200
|
|
@end example
|
|
|
|
@noindent
|
|
The concrete effect of the @code{%auto} directive depends on the type
|
|
of the affected field. The following sections document how.
|
|
|
|
@menu
|
|
* Counters:: Generating incremental Ids.
|
|
* Unique Identifiers:: Generating universally unique Ids.
|
|
* Time-Stamps:: Tracking the creation of records.
|
|
@end menu
|
|
|
|
@node Counters
|
|
@section Counters
|
|
@cindex counters
|
|
|
|
If an auto field is of type @code{int} or @code{range} then any
|
|
newly generated field will use the ``next biggest'' unused number in the
|
|
record set.
|
|
|
|
Consider the toy inventory database introduced above. We could
|
|
declare the @code{Id} field to be generated automatically:
|
|
|
|
@example
|
|
%rec: Item
|
|
%key: Id
|
|
%type: Id int
|
|
%mandatory: Description
|
|
%auto: Id
|
|
|
|
Id: 0
|
|
Description: 2cm metal soldier WWII
|
|
Amount: 2111
|
|
@end example
|
|
|
|
@noindent
|
|
When the next new item is introduced in the database, @command{recins}
|
|
will note the @code{%auto}, and create a new @code{Id} field for the
|
|
new record with the next-biggest unused integer, since @code{Id} is
|
|
declared to be of type @code{int}. In this example, the new record
|
|
would have an Id of @code{1}. The user can still provide an
|
|
explicit Id for the new record. In that case the field is not
|
|
generated automatically.
|
|
|
|
Note that if no explicit type is defined for an auto generated field
|
|
then it is assumed to be an integer.
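
As a minimal sketch of the behaviour described above, the following
commands recreate the inventory file and insert two records: one
without an @code{Id} and one with an explicit @code{Id}. The file
name @file{items.rec} and the two new items are made up for
illustration.

@example
# Recreate the inventory file shown above.
cat > items.rec <<'EOF'
%rec: Item
%key: Id
%type: Id int
%mandatory: Description
%auto: Id

Id: 0
Description: 2cm metal soldier WWII
Amount: 2111
EOF

# No Id given: recins generates one from the counter.
recins -t Item -f Description -v 'Wooden train set' \
       -f Amount -v 12 items.rec

# Explicit Id given: it is used as provided, nothing is generated.
recins -t Item -f Id -v 10 -f Description -v 'Kite' \
       -f Amount -v 3 items.rec

# Show the Ids now present in the record set.
recsel -t Item -p Id,Description items.rec
@end example

@noindent
Following the rule above, the first insertion should receive an
@code{Id} of @code{1}, while the second keeps its explicit @code{Id}
of @code{10}.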
|
|
|
|
@node Unique Identifiers
|
|
@section Unique Identifiers
|
|
@cindex unique identifiers
|
|
@cindex uuid
|
|
|
|
Universally Unique Identifiers, often abbreviated as UUIDs, can also
|
|
be auto-generated using recutils. Suppose you maintain a database
|
|
with events featuring the following record descriptor:
|
|
|
|
@example
|
|
%rec: Event
|
|
%key: Id
|
|
%mandatory: Title Date
|
|
@end example
|
|
|
|
@noindent
|
|
What would be appropriate to identify each event? We could use an
|
|
integer and declare it as auto-generated. After adding two events the
|
|
database would look like this:
|
|
|
|
@example
|
|
%rec: Event
|
|
%key: Id
|
|
%mandatory: Title Date
%auto: Id
|
|
|
|
Id: 0
|
|
Title: Team meeting
|
|
Date: 12-08-2013
|
|
|
|
Id: 1
|
|
Title: Dave's birthday
|
|
Date: 20-12-2013
|
|
@end example
|
|
|
|
@noindent
|
|
However, suppose that we want to share our events with other people,
|
|
@ie{} to send them event records and to incorporate their records into
|
|
our own database. In this case the @code{Id}s would collide. A good
solution is to declare the field to be of type @code{uuid} and to mark
it as @code{%auto}:
|
|
|
|
@example
|
|
%rec: Event
|
|
%key: Id
|
|
%type: Id uuid
|
|
%mandatory: Title Date
%auto: Id
|
|
|
|
Id: f81d4fae-7dec-11d0-a765-00a0c91e6bf6
|
|
Title: Team meeting
|
|
Date: 12-08-2013
|
|
|
|
Id: f81d4fae-dc18-11d0-a765-a01328400a0c
|
|
Title: Dave's birthday
|
|
Date: 20-12-2013
|
|
@end example
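
Inserting a record then works as it does with counters. The sketch
below assumes a recutils build with UUID support and uses a
hypothetical @file{events.rec} and a made-up event. The generated
identifier differs on every run, so no output is shown.

@example
# An empty record set whose key is an auto-generated UUID.
cat > events.rec <<'EOF'
%rec: Event
%key: Id
%type: Id uuid
%mandatory: Title Date
%auto: Id
EOF

# No Id is given, so recins is expected to generate one.
recins -t Event -f Title -v 'Release party' \
       -f Date -v '21-12-2013' events.rec

# Inspect the generated identifier.
recsel -t Event -p Id,Title events.rec
@end example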
|
|
|
|
@node Time-Stamps
|
|
@section Time-Stamps
|
|
|
|
@cindex timestamps
|
|
Auto generated dates can be used to implement automatic timestamps.
|
|
Consider for example a ``Transfer'' record set registering bank
|
|
transfers. We want to save a timestamp every time a transfer is done,
|
|
so we include an @code{%auto} for the date:
|
|
|
|
@example
|
|
%rec: Transfer
|
|
%key: Id
|
|
%type: Id int
|
|
%type: Date date
|
|
%auto: Id Date
|
|
@end example
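
@noindent
As a sketch of how this is used, the commands below register one
transfer in a hypothetical @file{transfers.rec}. The @code{Amount}
field is an assumption made for the example. Only the amount is given
on the command line, so @command{recins} is expected to add both the
@code{Id} and the @code{Date}. The textual form of the generated date
is not shown here because it depends on the moment of insertion.

@example
# An empty record set with an auto-generated counter and timestamp.
cat > transfers.rec <<'EOF'
%rec: Transfer
%key: Id
%type: Id int
%type: Date date
%auto: Id Date
EOF

# Neither Id nor Date is provided by the user.
recins -t Transfer -f Amount -v 250 transfers.rec

# The new record should start with the generated Id and Date.
recsel -t Transfer transfers.rec
@end example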
|
|
|
|
@node Encryption
|
|
@chapter Encryption
|
|
|
|
@cindex encryption
|
|
|
|
For ethical or security reasons it is sometimes necessary that information
|
|
in a recfile should not be readable by unauthorized people.
|
|
One way to prevent a recfile from being read is to use the security features of
|
|
the operating system.
|
|
A more secure way would be to encrypt the entire recfile using a free strong encryption program
|
|
such as @uref{http://gnu.org/software/gnupg,GnuPG}.
|
|
The disadvantage of both these methods is that the entire
|
|
recfile has to be secured
|
|
when it may well be the case that only certain data need to be protected.
|
|
|
|
Recutils offers a way to encrypt specified fields in a record, whilst leaving
|
|
the rest in clear text.
|
|
|
|
@menu
|
|
* Confidential Fields:: Declaring fields as sensitive data.
|
|
* Encrypting Files:: Encrypt confidential fields.
|
|
* Decrypting Data:: Reading encrypted fields.
|
|
@end menu
|
|
|
|
@node Confidential Fields
|
|
@section Confidential Fields
|
|
|
|
@cindex @code{%confidential}
|
|
@cindex passwords
|
|
@cindex confidential data
|
|
To specify that a field should be encrypted, use the @code{%confidential}
|
|
special field.
|
|
This special field declares a set of fields as
|
|
@dfn{confidential}, meaning they contain secret data such as
|
|
passwords or personal information.
|
|
Its usage is:
|
|
|
|
@example
|
|
%confidential: @var{field1} @var{field2} @dots{} @var{fieldN}
|
|
@end example
|
|
|
|
@noindent
|
|
The field names are separated by one or more blank characters.
|
|
There can be several @code{%confidential} fields in the same record
|
|
descriptor, the effective list of confidential fields being the union
|
|
of all the entries.
|
|
|
|
@cindex encrypted fields
|
|
Declaring a field as confidential indicates that its contents must not
|
|
be stored in plain text, but encrypted with a password-based
|
|
mechanism. When the information is retrieved from the database the
|
|
confidential fields are unencrypted if the correct password is
|
|
provided. Likewise, when information is inserted in the database the
|
|
confidential fields are encrypted with some given password.
|
|
|
|
For example, consider a database of users of some service. For each
|
|
user we want to store a name, a login name, an email address and a
|
|
password. All this information is public with the obvious exception
|
|
of the password. Thus we declare the @code{Password} field as
|
|
confidential in the corresponding record descriptor:
|
|
|
|
@example
|
|
%rec: Account
|
|
%type: Name line
|
|
%type: Login line
|
|
%type: Email email
|
|
%confidential: Password
|
|
@end example
|
|
|
|
The rec format does not impose the usage of a specific encryption
|
|
algorithm, but requires that:
|
|
|
|
@itemize @minus
|
|
@item The algorithm must be password-based.
|
|
@item The value of any encrypted field shall begin with the string
|
|
@samp{encrypted-} followed by the encrypted data.
|
|
@item The encrypted data must be encoded in some ASCII encoding such
|
|
as base64.
|
|
@end itemize
|
|
|
|
The above rules ensure that it is possible to determine whether a
|
|
given field is encrypted. For example, the following is an excerpt
|
|
from the account database described above. It contains an entry with
|
|
the password encrypted and another with the password unencrypted:
|
|
|
|
@example
|
|
Name: Mr. Foo
|
|
Login: foo
|
|
Email: foo@@foo.com
|
|
Password: encrypted-AAABBBCCDDDEEEFFF
|
|
|
|
Name: Mr. Bar
|
|
Login: bar
|
|
Email: bar@@bar.com
|
|
Password: secret
|
|
@end example
|
|
|
|
An unencrypted confidential field is a data integrity error,
and utilities such as @command{recfix} will report it.
|
|
@cindex integrity problems
|
|
The same utility can
|
|
be used to ``fix'' the database by encrypting, in one pass, any
|
|
unencrypted field.
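
A minimal sketch of that workflow follows. It assumes a recutils
build with encryption support, the @option{--check} and
@option{--encrypt} operations of @command{recfix}, and a made-up
@file{accounts.rec} holding one clear-text password.

@example
# A record set with a confidential field stored in clear text.
cat > accounts.rec <<'EOF'
%rec: Account
%confidential: Password

Login: bar
Name: Mr. Bar
Password: secret
EOF

# Report integrity problems; the clear-text Password should be flagged.
recfix --check accounts.rec

# Encrypt every confidential field that is still in clear text.
recfix --encrypt -s mypassword accounts.rec

# The stored value should now start with "encrypted-".
grep '^Password:' accounts.rec
@end example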
|
|
|
|
Nothing prevents the usage of several passwords in the same database.
|
|
This allows the establishment of several levels of security or
|
|
security profiles. For example, we may want to store different
|
|
passwords for different online services:
|
|
|
|
@example
|
|
%rec: Account
|
|
%confidential: WebPassword ShellPassword
|
|
@end example
|
|
|
|
@noindent
|
|
We could then encrypt WebPassword entries using a password shared
|
|
among all the webmasters, and the ShellPassword entries with a more
|
|
restricted password available only to the administrator of the
|
|
machine.
|
|
|
|
Note that since the utilities accept only one password per invocation,
fields encrypted with different passwords cannot be decrypted together. This
|
|
means that in the example above the administrator would need to run
|
|
@command{recsel} twice in order to decrypt all the encrypted data in
|
|
the recfile.
|
|
|
|
The GNU recutils fully support encrypted fields. See the documentation
|
|
for @command{recsel}, @command{recins} and @command{recfix} for details on how
|
|
to operate on files containing confidential fields.
|
|
|
|
@node Encrypting Files
|
|
@section Encrypting Files
|
|
|
|
@command{recins} allows the insertion of encrypted fields in a
|
|
database. When the @option{-s} (@option{--password}) option is given
on the command line, any field declared as confidential in
the record descriptor is encrypted using the given password.
|
|
If the command is executed interactively and @option{-s} is not used
|
|
then the user is asked to provide a password using the terminal. For
|
|
example, the invocation:
|
|
|
|
@example
|
|
$ recins -t Account -s mypassword -f Login -v foo -f Password \
|
|
-v secret accounts.rec
|
|
@end example
|
|
|
|
@noindent
|
|
will encrypt the value of the @code{Password} field with
|
|
@code{mypassword}, as long as the field is declared as confidential
(@pxref{Confidential Fields}).
|
|
|
|
@command{recins} will issue a warning if a confidential field is
|
|
inserted in the database but no password was provided to encrypt it.
|
|
This is to avoid having unencrypted sensitive data in the recfiles.
|
|
|
|
@node Decrypting Data
|
|
@section Decrypting Data
|
|
|
|
The contents of confidential fields can be read using the
|
|
@option{-s} (@option{--password}) command line option to @command{recsel}. When
it is used, @command{recsel} tries to decrypt the encrypted fields of every
selected record with the given password. If the operation succeeds then
|
|
the output will include the unencrypted data. Otherwise the
|
|
ASCII-encoded encrypted data will be emitted.
|
|
|
|
If @command{recsel} is invoked interactively and no password is
|
|
specified with @option{-s}, the user will be asked for a password in
|
|
case one is needed. The password is not echoed to the screen.
|
|
The provided password will be used to decrypt all confidential fields
|
|
as if it was specified with @option{-s}.
|
|
|
|
For example, consider the following database storing information about
|
|
the user accounts of some online service. Each entry stores a login,
|
|
a full name, email and a password. The password is declared as
|
|
confidential:
|
|
|
|
@example
|
|
%rec: Account
|
|
%key: Login
|
|
%confidential: Password
|
|
|
|
Login: foo
|
|
Name: Mr. Foo
|
|
Email: foo@@foo.com
|
|
Password: encrypted-AAABBBCCCDDD
|
|
|
|
Login: bar
|
|
Name: Ms. Bar
|
|
Email: bar@@bar.org
|
|
Password: encrypted-XXXYYYZZZUUU
|
|
@end example
|
|
|
|
@noindent
|
|
If we use @command{recsel} to get a list of records of type
|
|
@code{Account} without specifying a password, or if the wrong password
|
|
was specified in interactive mode, then we would get the following
|
|
output with the encrypted values:
|
|
|
|
@example
|
|
$ cat accounts.rec | recsel -t Account -p Login,Password
|
|
Login: foo
|
|
Password: encrypted-AAABBBCCCDDD
|
|
|
|
Login: bar
|
|
Password: encrypted-XXXYYYZZZUUU
|
|
@end example
|
|
|
|
@noindent
|
|
If we specify a password and both entries were encrypted using that
|
|
password, we would get the unencrypted values:
|
|
|
|
@example
|
|
$ recsel -t Account -s secret -p Login,Password accounts.rec
|
|
Login: foo
|
|
Password: foosecret
|
|
|
|
Login: bar
|
|
Password: barsecret
|
|
@end example
|
|
|
|
As mentioned above, a confidential field may be encrypted with
|
|
different passwords in different records (@pxref{Confidential Fields}).
|
|
For example,
|
|
we may have an entry in our database with data about the account of
|
|
the administrator of the online service. We might want
to protect the password associated with that account with a
different encryption password from the one used for ordinary users. In that case
the output of the last command would have been:
|
|
|
|
@example
|
|
$ recsel -t Account -s secret -p Login,Password accounts.rec
|
|
Login: foo
|
|
Password: foosecret
|
|
|
|
Login: bar
|
|
Password: barsecret
|
|
|
|
Login: admin
|
|
Password: encrypted-TTTVVVBBBNNN
|
|
@end example
|
|
|
|
@noindent
|
|
We would need to invoke @command{recsel} with the password used to
|
|
encrypt the admin entry in order to read it back unencrypted.
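
The whole scenario can be sketched as follows, assuming a recutils
build with encryption support. The encryption passwords
(@code{secret}, @code{adminsecret}) and the stored value
@code{rootsecret} are made up for illustration.

@example
# Start from an empty record set.
cat > accounts.rec <<'EOF'
%rec: Account
%key: Login
%confidential: Password
EOF

# Ordinary users share one encryption password ...
recins -t Account -s secret -f Login -v foo \
       -f Password -v foosecret accounts.rec
recins -t Account -s secret -f Login -v bar \
       -f Password -v barsecret accounts.rec

# ... while the entry of the administrator uses another one.
recins -t Account -s adminsecret -f Login -v admin \
       -f Password -v rootsecret accounts.rec

# Read back only the entry of the administrator, decrypted.
recsel -t Account -s adminsecret -e "Login = 'admin'" \
       -p Login,Password accounts.rec
@end example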
|
|
|
|
@node Generating Reports
|
|
@chapter Generating Reports
|
|
|
|
@cindex reports
|
|
Having a list of names and addresses, one might want to use this list
|
|
to address envelopes
|
|
(say, to send annual greeting cards).
|
|
Since addresses are normally written on several lines, it would be appropriate
|
|
then to split the @code{Address} field values across multiple lines as described in
|
|
@ref{Fields}.
|
|
Suitable text can now be obtained thus:
|
|
|
|
@example
|
|
$ recsel -t Person -j Abode -P Name,Abode_Address acquaintances.rec
|
|
Charles Spencer
|
|
2 Serpe Rise,
|
|
Little Worning,
|
|
SURREY
|
|
|
|
Dirk Spencer
|
|
2 Serpe Rise,
|
|
Little Worning,
|
|
SURREY
|
|
|
|
Ernest Wright
|
|
1 Wanter Rise,
|
|
Greater Inncombe,
|
|
BUCKS
|
|
@end example
|
|
|
|
A business enterprise might want to go one step further and generate letters
|
|
(such as an advertisement or a recall notice) to customers.
|
|
Since @command{recsel} merely selects records and fields from record sets, on
|
|
its own it cannot do this; so
|
|
there is another command designed for this purpose, called @command{recfmt}.
|
|
@cindex @command{recfmt}
|
|
@cindex templates
|
|
This command uses a @dfn{template} which defines the general form of the
|
|
desired output.
|
|
A letter template might look as follows:
|
|
@example
|
|
@{@{Name@}@}
|
|
@{@{Abode_Address@}@}
|
|
|
|
Dear @{@{Name@}@},
|
|
|
|
Re: Special offer for January
|
|
|
|
We are delighted to be able to offer you a 95% discount on all car and
|
|
truck hire contracts between 1 January and 2 February. Please call us
|
|
to take advantage of this offer.
|
|
|
|
Yours sincerely,
|
|
|
|
|
|
Karen van Rental (CEO)
|
|
^L
|
|
@end example
|
|
|
|
It is best to place such a template into a file, so that you can edit it
|
|
as you wish.
|
|
Notice the instances of double braces enclosing a field name, @eg{} @code{@{@{Name@}@}}.
|
|
These are called @dfn{slots} and indicate places where the respective field's
|
|
value should be placed.
|
|
@cindex slots
|
|
Let's assume this template is in a file called @file{offer.templ}.
|
|
We can then pipe the output from @command{recsel} into @command{recfmt}
as follows:
|
|
|
|
@example
|
|
$ recsel -t Person -j Abode acquaintances.rec | recfmt -f offer.templ
|
|
Charles Spencer
|
|
2 Serpe Rise,
|
|
Little Worning,
|
|
SURREY
|
|
|
|
Dear Charles Spencer,
|
|
|
|
Re: Special offer for January
|
|
|
|
We are delighted to be able to offer you a 95% discount on all car and
|
|
.
|
|
.
|
|
.
|
|
@end example
|
|
|
|
@noindent For each record that @command{recsel} selects, one copy of
|
|
@file{offer.templ} will be generated. Each slot will be replaced
|
|
with the field value corresponding to the field name in the slot.
|
|
|
|
@menu
|
|
* Templates:: Formatted output.
|
|
@end menu
|
|
|
|
@node Templates
|
|
@section Templates
|
|
|
|
@cindex templates
|
|
A recfmt template is a text string that may contain @dfn{template
|
|
slots}. Those slots are substituted in the template using the
|
|
information of a given record. Any text that is not within a slot is
|
|
copied literally to the output.
|
|
|
|
Slots are written surrounded by double curly braces, like:
|
|
|
|
@example
|
|
@{@{@dots{}@}@}
|
|
@end example
|
|
|
|
Slots contain selection expressions, which are evaluated every time the
|
|
template is applied to a record. The slot is then replaced by the
|
|
string representation of the value returned by the expression.
|
|
|
|
For example, consider the following template:
|
|
|
|
@example
|
|
Task @{@{Id@}@}: @{@{Summary@}@}
|
|
------------------------
|
|
@{@{Description@}@}
|
|
--
|
|
Created at @{@{CreatedAt@}@}
|
|
@end example
|
|
|
|
@noindent
|
|
When applied to the following record:
|
|
|
|
@example
|
|
Id: 123
|
|
Summary: Fix recfmt.
|
|
CreatedAt: 12 December 2010
|
|
Description:
|
|
+ The recfmt tool shall be fixed, because right
|
|
+ now it is leaking 200 megabytes per processed record.
|
|
@end example
|
|
|
|
@noindent
|
|
The result is:
|
|
|
|
@example
|
|
Task 123: Fix recfmt.
|
|
------------------------
|
|
The recfmt tool shall be fixed, because right
|
|
now it is leaking 200 megabytes per processed record.
|
|
--
|
|
Created at 12 December 2010
|
|
@end example
|
|
|
|
You can use any selection expression in the slots, including
|
|
conditionals and string concatenation.
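
As a sketch of such expressions, the template below adds a conditional
and a concatenation to the previous one. It assumes the @code{?:} and
@code{&} operators of selection expressions and a hypothetical
@code{Closed} field that is not part of the record shown above.

@example
Task @{@{Id@}@}: @{@{Summary@}@}
Status: @{@{Closed = 'yes' ? 'closed' : 'open'@}@}
Reference: @{@{'TASK-' & Id@}@}
@end example

@noindent
Each slot is still replaced by a single string: the conditional picks
one of its two alternatives, and the concatenation joins the literal
prefix with the value of the @code{Id} field.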
|
|
|
|
@node Interoperability
|
|
@chapter Interoperability
|
|
|
|
Included in the recutils package are a number of utilities to assist
|
|
in the creation
|
|
of recfiles using data which already exists in other formats,
|
|
and for exporting data from recfiles so that it can be used in other applications.
|
|
|
|
@menu
|
|
* CSV Files:: Converting recfiles to/from csv files.
|
|
* Importing MDB Files:: Importing MS Access Databases.
|
|
@end menu
|
|
|
|
@node CSV Files
|
|
@section CSV Files
|
|
|
|
@cindex csv
|
|
@cindex comma separated values
|
|
|
|
Many applications are able to read and write files containing so-called
|
|
``comma separated values''.
|
|
Such files generally contain tabular data where the columns are separated
|
|
by commas and the rows by line feed and/or carriage return characters.
|
|
Although record sets are not tables, tables can be easily emulated
|
|
using records having the same fields in the same order. For example:
|
|
|
|
@example
|
|
a: value
|
|
b: value
|
|
c: value
|
|
|
|
a: value
|
|
b: value
|
|
c: value
|
|
|
|
@dots{}
|
|
@end example
|
|
|
|
In several respects records are more flexible than tables:
|
|
|
|
@itemize @minus
|
|
@item Fields can appear in a different order in several records.
|
|
@item There can be several fields with the same name in a single record.
|
|
@item Records can differ in the number of fields.
|
|
@end itemize
|
|
|
|
It is evident that records, such as those in recfiles, are a more
|
|
general structure than comma separated values.
|
|
This means that when converting from recfiles to csv files, certain
decisions need to be made.
The @command{rec2csv} utility (@pxref{Invoking rec2csv})
implements an algorithm to deal with this problem
and generate the table that the user expects.
|
|
|
|
The algorithm works as follows:
|
|
|
|
@enumerate
|
|
@item
|
|
The utility first scans the specified
|
|
record set, building a list with the names that will become the table
|
|
header.
|
|
|
|
@item
|
|
For each field, a header is added with the form:
|
|
|
|
@example
|
|
FIELDNAME[_@var{n}]
|
|
@end example
|
|
|
|
@noindent
|
|
where @var{n} is a number in the range @code{2..inf}: the position of
the field among the fields with the same name in its containing record,
counting from one. The first field with a given name gets no suffix.
|
|
For example, consider
|
|
the following record set:
|
|
|
|
@example
|
|
a: a1
|
|
b: b11
|
|
b: b12
|
|
c: c1
|
|
|
|
a: a2
|
|
b: b2
|
|
d: d2
|
|
@end example
|
|
|
|
The corresponding list of headers is:
|
|
|
|
@example
|
|
a b b_2 c a b d
|
|
@end example
|
|
|
|
@item
|
|
Then duplicates are removed:
|
|
|
|
@example
|
|
a b b_2 c d
|
|
@end example
|
|
|
|
@item
|
|
The resulting list of headers is then used to build the table in the
|
|
generated csv file.
|
|
@end enumerate
|
|
|
|
In the above example the result would be
|
|
|
|
@example
|
|
"a","b","b_2","c","d"
|
|
"a1","b11","b12","c1",
|
|
"a2","b2",,,"d2"
|
|
@end example
|
|
|
|
As shown, missing fields are implemented as empty columns in the generated
|
|
csv.
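
The example can be reproduced from the shell. The sketch below uses a
made-up file name, @file{table.rec}, and then feeds the generated csv
to @command{csv2rec} to show the opposite direction.

@example
# The record set used in the description of the algorithm.
cat > table.rec <<'EOF'
a: a1
b: b11
b: b12
c: c1

a: a2
b: b2
d: d2
EOF

# Build the headers and the table as described above.
rec2csv table.rec > table.csv
cat table.csv

# Convert the table back into records.
csv2rec table.csv
@end example

@noindent
The round trip need not restore the original records, since the csv
file no longer records which columns came from repeated fields or
which fields were missing.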
|
|
|
|
@node Importing MDB Files
|
|
@section Importing MDB Files
|
|
|
|
Access files (@dfn{mdb files}) are collections of several relations,
|
|
also known as tables. Tables can be either @dfn{user tables} storing
|
|
user data, or @dfn{system tables} storing information such as forms,
|
|
queries or the relationships between the tables.
|
|
|
|
It is possible to get a listing with the names of all tables stored in
|
|
an mdb file by calling @command{mdb2rec} in the following way:
|
|
|
|
@example
|
|
$ mdb2rec -l sales.mdb
|
|
Customers
|
|
Products
|
|
Orders
|
|
@end example
|
|
|
|
So @file{sales.mdb} stores user information in the tables Customers,
|
|
Products and Orders. If we want to include system tables in the
|
|
listing we can use the @samp{-s} command line option:
|
|
|
|
@example
|
|
$ mdb2rec -s -l sales.mdb
|
|
MSysObjects
|
|
MSysACEs
|
|
MSysQueries
|
|
MSysRelationships
|
|
Customers
|
|
Products
|
|
Orders
|
|
@end example
|
|
|
|
The tables with names starting with @samp{MSys} are system tables.
|
|
The data stored in those tables is either not relevant to the recutils
|
|
user (used by the Access program to create forms and the like) or is
|
|
used in an indirect way by @command{mdb2rec} (such as the information
|
|
from MSysRelationships).
|
|
|
|
Let's read some data from the @file{mdb} file. We can get the
|
|
relation of Products in rec format:
|
|
|
|
@example
|
|
$ mdb2rec sales.mdb Products
|
|
%rec: Products
|
|
%type: ProductID int
|
|
%type: ProductName size 80
|
|
%type: Discontinued bool
|
|
|
|
ProductID: 1
|
|
ProductName: GNU generation T-shirt
|
|
Discontinued: 0
|
|
|
|
@dots{}
|
|
@end example
|
|
|
|
A @dfn{record descriptor} is created for the record set containing the
|
|
generated records, called Products. As seen in the example, @command{mdb2rec} is
|
|
able to generate type information for the fields. The list of
|
|
customers is similar:
|
|
|
|
@example
|
|
$ mdb2rec sales.mdb Customers
|
|
%rec: Customers
|
|
%type: CustomerID size 4
|
|
%type: CompanyName size 80
|
|
%type: ContactName size 60
|
|
|
|
CustomerID: GSOFT
|
|
CompanyName: GNU Soft
|
|
ContactName: Jose E. Marchesi
|
|
|
|
@dots{}
|
|
@end example
|
|
|
|
If no table is specified in the invocation to @command{mdb2rec} all
|
|
the tables in the file are processed, with the exception of the system
|
|
tables, which are processed only when @samp{-s} is used:
|
|
|
|
@example
|
|
$ mdb2rec sales.mdb
|
|
%rec: Products
|
|
@dots{}
|
|
|
|
%rec: Customers
|
|
@dots{}
|
|
|
|
%rec: Orders
|
|
@dots{}
|
|
@end example
|
|
|
|
@node Bash Builtins
|
|
@chapter Bash Builtins
|
|
|
|
@cindex bash
|
|
@cindex interactive use
|
|
@cindex shell
|
|
The command-line utilities described in @ref{Invoking the Utilities} are
|
|
designed to be used interactively in the shell.
|
|
Together, and often
|
|
combined with the standard shell utilities, they provide a quite
|
|
complete user interface.
|
|
However, the user's experience can be greatly
|
|
improved by a closer integration between the recutils and the shell.
|
|
The following sections describe several extensions for @command{bash},
|
|
the GNU shell (@pxref{Top,,, bash, The GNU Bourne-Again SHell}).
|
|
These extensions make the shell ``aware'' of the recutils.
|
|
|
|
As with any bash built-in, help is available in the command line using
|
|
the @command{help} command. For example:
|
|
|
|
@example
|
|
$ help readrec
|
|
@end example
|
|
|
|
If you installed recutils using a binary package in a GNU/Linux
|
|
distribution, odds are that the built-in commands described in this
|
|
chapter are already available to you. Otherwise (you get a ``command
|
|
not found'' or similar error) you may have to register the built-in
|
|
commands with your bash. This is very easy using the @command{enable}
|
|
bash command. The registering command for readrec would be:
|
|
|
|
@example
|
|
$ enable -f readrec.so readrec
|
|
@end example
|
|
|
|
Note however that some systems require the full path to
|
|
@file{readrec.so} in order for this command to work.
|
|
|
|
|
|
@menu
|
|
* readrec:: Exporting the contents of records to the shell.
|
|
@end menu
|
|
|
|
@node readrec
|
|
@section readrec
|
|
|
|
The bash built-in @command{read}, when invoked with no options,
|
|
consumes one line from standard input and makes it available in
|
|
the predefined @code{REPLY} shell variable, or any other
|
|
variable whose name is passed as an argument. This allows processing
|
|
data structured in lines in a quite natural way. For example, the
|
|
following program prints the third field of each line, with fields
|
|
separated by commas, until standard input is exhausted:
|
|
|
|
@example
|
|
# Process one line at a time.
|
|
while read
|
|
do
|
|
echo "The third field is $(echo "$REPLY" | cut -d, -f 3)"
|
|
done
|
|
@end example
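
When the fields are separated by a single character, @command{read}
can also split the line itself through the @code{IFS} variable, which
avoids calling @command{cut} for every line. The following sketch
feeds two sample lines, made up for illustration, to such a loop. The
variable names are arbitrary.

@example
# Split each line on commas; anything after the third field
# ends up in the last variable.
printf 'x,y,z\n1,2,3\n' | while IFS=, read -r first second third rest
do
  echo "The third field is $third"
done
@end example

@noindent
This prints @samp{The third field is z} followed by
@samp{The third field is 3}.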
|
|
|
|
However, @command{read} is not very useful when it comes to
|
|
processing recutils records in the shell. Even though it is
|
|
possible to customize the character used by @command{read} to split
|
|
the input into records, we would need to ignore the empty records in
|
|
the likely case of more than one empty line separating records.
|
|
Also, we would need to use @command{recsel} to access the record
|
|
fields. Too complicated!
|
|
|
|
Thus, the @command{readrec} bash built-in is similar to @command{read} with
|
|
the difference that it reads records instead of lines. It also
|
|
``exports'' the contents of the record to the user as the values of
|
|
several shell variables:
|
|
|
|
@itemize @minus
|
|
@item @code{REPLY_REC} is set to the record read from standard input.
|
|
@item A set of variables @code{FIELD} named after each field found in
|
|
the record are set to the (decoded) value of the fields found in the
|
|
input record. When several fields with the same name are found in the
|
|
input record then a bash array is created.
|
|
@end itemize
|
|
|
|
Consider for example the following simple database containing
|
|
contacts information:
|
|
|
|
@example
|
|
Name: Mr. Foo
|
|
Email: foo@@bar.com
|
|
Email: bar@@baz.net
|
|
Checked: no
|
|
|
|
Name: Mr. Bar
|
|
Email: bar@@foo.com
|
|
Telephone: 999666000
|
|
Checked: yes
|
|
@end example
|
|
|
|
@noindent
|
|
We would like to write some shell code to send an email to all the
|
|
contacts, but only if the contact has not been checked before,
|
|
@ie{} the @code{Checked} field contains @code{no}. The following code
|
|
snippet would do the job nicely using @command{readrec}:
|
|
|
|
@example
|
|
recsel contacts.rec | while readrec
|
|
do
|
|
if [ "$Checked" = "no" ]
|
|
then
|
|
mail -s "You are being checked." "$@{Email[0]@}" < email.txt
|
|
recset -e "Email = '$Email'" -f Checked -S yes contacts.rec
|
|
sleep 1
|
|
fi
|
|
done
|
|
@end example
|
|
|
|
@noindent
|
|
Note the usage of the bash array when accessing the primary email
|
|
address of each contact. Note also that we update each contact to
|
|
figure as ``checked'', using @command{recset}, so she won't get
|
|
pestered again the next time the
|
|
script is run.
|
|
|
|
@node Invoking the Utilities
|
|
@chapter Invoking the Utilities
|
|
|
|
Certain options are available in all of these programs. Rather than
|
|
writing identical descriptions for each of the programs, they are
|
|
listed here.
|
|
|
|
@anchor{Common Options}
|
|
@table @samp
|
|
@item --version
|
|
Print the version number, then exit successfully.
|
|
@item --help
|
|
Print a help message, then exit successfully.
|
|
@item --
|
|
Delimit the option list. Later arguments, if any, are treated as
|
|
operands even if they begin with @option{-}. For example,
|
|
@code{recsel -- -p} reads from the file named @file{-p}.
|
|
@end table
|
|
|
|
@menu
|
|
* Invoking recinf:: Printing information about rec files.
|
|
* Invoking recsel:: Selecting records.
|
|
* Invoking recins:: Inserting records.
|
|
* Invoking recdel:: Deleting records.
|
|
* Invoking recset:: Managing fields.
|
|
* Invoking recfix:: Fixing broken rec files, and diagnostics.
|
|
* Invoking recfmt:: Formatting records using templates.
|
|
* Invoking csv2rec:: Converting csv data into rec data.
|
|
* Invoking rec2csv:: Converting rec data into csv data.
|
|
* Invoking mdb2rec:: Converting mdb files into rec files.
|
|
@end menu
|
|
|
|
@node Invoking recinf
|
|
@section Invoking recinf
|
|
@cindex @command{recinf}
|
|
|
|
@command{recinf} reads the given rec files (or the data from
|
|
standard input if no file is specified) and prints a summary of the
|
|
record types contained in the input.
|
|
|
|
Synopsis:
|
|
|
|
@example
|
|
recinf [@var{option}]@dots{} [@var{file}]@dots{}
|
|
@end example
|
|
|
|
The default behavior is to emit a line per record type in
|
|
the input containing its name and the number of records of that type:
|
|
|
|
@example
|
|
$ recinf hackers.rec tasks.rec
|
|
25 Hacker
|
|
102 Task
|
|
@end example
|
|
|
|
If the input contains anonymous records, @ie{} records that are before
|
|
the first record descriptor, the corresponding output line won't have
|
|
a type name:
|
|
|
|
@example
|
|
$ recinf data.rec
|
|
10
|
|
@end example
|
|
|
|
In addition to the common options described earlier (@pxref{Common Options}), the program accepts the following options.
|
|
|
|
@table @samp
|
|
@item -t @var{type}
|
|
@itemx --type=@var{type}
|
|
Select records of a given type only.
|
|
@item -d
|
|
@itemx --descriptor
|
|
Print all the record descriptors present in the file.
|
|
@item -n
|
|
@itemx --names-only
|
|
Output just the names of the record types found in the input. If the
|
|
input contains only anonymous records then output nothing.
|
|
@item -S
|
|
@itemx --print-sexps
|
|
Print the data in the form of sexps (Lisp expressions) instead of rec
|
|
format. This option can be useful for, of course, Lisp programs.
|
|
@end table
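
As a short sketch of these options, the commands below run
@command{recinf} on a small file with two record types. The file
name @file{data.rec} and its contents are made up for illustration.

@example
# One Hacker record and two Task records.
cat > data.rec <<'EOF'
%rec: Hacker

Name: John Smith

%rec: Task

Id: 1

Id: 2
EOF

# Default output: one line per record type, with the record count.
recinf data.rec

# Only the names of the record types.
recinf -n data.rec

# Only the record descriptor of the Task records.
recinf -t Task -d data.rec
@end example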
|
|
|
|
@node Invoking recsel
|
|
@section Invoking recsel
|
|
@cindex @command{recsel}
|
|
|
|
@cindex selecting records
|
|
@command{recsel} reads the given rec files (or the data from
standard input if no file is specified) and prints the records, or
parts of records, that satisfy the criteria specified by the user.
Synopsis:
|
|
|
|
@example
|
|
recsel [@var{option}]@dots{} \
|
|
[-n @var{indexes} | -e @var{record_expr} | -q @var{str} | -m @var{num}] \
|
|
[-c | (-p|-P|-R) @var{field_expr}] \
|
|
[@var{file}]@dots{}
|
|
@end example
|
|
|
|
If no @var{file} is specified then the command acts like a filter, getting
|
|
the data from standard input and writing the result to
|
|
standard output.
|
|
|
|
In addition to the common options described earlier (@pxref{Common
|
|
Options}) the program accepts the following options.
|
|
|
|
@noindent
|
|
The following @dfn{global options} are available.
|
|
|
|
@table @samp
|
|
@item -i
|
|
@itemx --case-insensitive
|
|
Make string matching case-insensitive in selection expressions.
|
|
@cindex case, in selection expressions
|
|
@item -C
|
|
@itemx --collapse
Do not separate the output records with blank lines.
|
|
@item -d
|
|
@itemx --include-descriptors
|
|
Print record descriptors along with the matched records.
|
|
@item -s @var{secret}
|
|
@itemx --password=@var{secret}
|
|
Try to decrypt confidential fields with the given password.
|
|
@item -S @var{fields}
|
|
@itemx --sort=@var{fields}
|
|
@cindex sorting
|
|
Sort the output by the comma-separated list of field names,
|
|
@var{fields}. This option takes precedence over any sorting criteria
|
|
specified in the corresponding record descriptor with @code{%sort}.
|
|
@item -U
|
|
@itemx --uniq
|
|
Remove duplicated fields from the output records. Two fields are
duplicates if they have the same name and the same value.
|
|
@item -G @var{fields}
|
|
@itemx --group-by=@var{fields}
|
|
Group the output records by the provided comma-separated list of
|
|
@var{fields}. Grouping is performed before sorting.
|
|
@end table
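
A minimal sketch of two of the global options, assuming a hypothetical
file @file{contacts.rec} with a record set of type @code{Contact}
whose records have @code{Name} and @code{Phone} fields (all of these
names are invented for the illustration). The first command sorts the
selected records by @code{Name} and omits the blank lines between
them; the second matches a name regardless of letter case:

@example
$ recsel -t Contact -S Name -C contacts.rec
$ recsel -t Contact -i -e "Name ~ 'ada'" contacts.rec
@end example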
|
|
|
|
The @dfn{selection options} are used to select a subset of
|
|
the records in the input.
|
|
|
|
@table @samp
|
|
@item -n @var{indexes}
|
|
@itemx --number=@var{indexes}
Match the records occupying the given positions in their record set.
@var{indexes} must be a comma-separated list of numbers or ranges, a
range being two numbers separated by a dash. Positions are counted
from zero. For example, the following list denotes the first, the
third, and the fifth through the tenth records: @samp{-n 0,2,4-9}.
|
|
@item -e @var{expr}
|
|
@itemx --expression=@var{expr}
|
|
A record selection expression (@pxref{Selection Expressions}). Only
|
|
the records matched by the expression will be taken into account to
|
|
compute the output.
|
|
@item -q @var{str}
|
|
@itemx --quick=@var{str}
|
|
Select records having a field whose value contains the substring
|
|
@var{str}.
|
|
@item -m @var{num}
|
|
@itemx --random=@var{num}
|
|
Select @var{num} random records. If @var{num} is zero then select all
|
|
the records.
|
|
@item -t @var{type}
|
|
@itemx --type=@var{type}
|
|
Select records of a given type only.
|
|
@item -j @var{field}
|
|
@itemx --join=@var{field}
|
|
Perform an inner join of the record set selected by @option{-t} and
|
|
the record set for which @var{field} is a foreign key. @var{field}
|
|
must be a field declared with type @code{rec} and thus must be a
|
|
foreign key. If a join is performed then any selection expression and
|
|
field expression operate on the joined record sets.
|
|
@end table
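
The selection options can be sketched as follows, assuming a
hypothetical file @file{contacts.rec} whose @code{Contact} record set
contains a record with @code{Name: Ada} and @code{Phone: 555-0101}
followed by one with @code{Name: Grace} and @code{Phone: 555-0102}.
These names and values are assumptions made for the example only.

@example
$ recsel -t Contact -e "Name = 'Ada'" contacts.rec
Name: Ada
Phone: 555-0101
@end example

@noindent
With the same file, the following commands would select the first
record by position, the records having a field that contains the
substring @samp{0102}, and one record chosen at random:

@example
$ recsel -t Contact -n 0 contacts.rec
$ recsel -t Contact -q 0102 contacts.rec
$ recsel -t Contact -m 1 contacts.rec
@end example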
|
|
|
|
The @dfn{output options} are used to determine what information about
|
|
the selected records to display to the user, and how to display it.
|
|
|
|
@table @samp
|
|
@item -p @var{name_list}
|
|
@itemx --print=@var{name_list}
|
|
List of fields to print for each record. @var{name_list} is a
|
|
list of field names separated by commas. For example:
|
|
@example
|
|
-p Name,Email
|
|
@end example
|
|
|
|
@noindent
|
|
means to print the Name and the Email of every matching record, both
|
|
the field names and values.
|
|
|
|
If this option is not specified then all the fields of the matching
|
|
records are printed to standard output.
|
|
@item -P @var{name_list}
|
|
@itemx --print-values=@var{name_list}
|
|
Same as @samp{-p}, but print only the values of the selected fields.
|
|
@item -R @var{name_list}
|
|
@itemx --print-row=@var{name_list}
|
|
Same as @samp{-P}, but print the values separated by single
|
|
spaces instead of newlines.
|
|
@item -c
|
|
@itemx --count
|
|
If this option is specified then @command{recsel} will print the number of
|
|
matching records instead of the records themselves. This option is
|
|
incompatible with @option{-p}, @option{-P} and @option{-R}.
|
|
@end table
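
To illustrate the output options, assume a hypothetical file
@file{contacts.rec} whose @code{Contact} record set holds two records,
named @samp{Ada} and @samp{Grace}, each with a @code{Phone} field (the
file and its contents are invented for this sketch). Printing a single
field with @option{-p} and counting with @option{-c} would then look
like this:

@example
$ recsel -t Contact -p Name contacts.rec
Name: Ada

Name: Grace
$ recsel -t Contact -c contacts.rec
2
@end example

@noindent
The remaining two options print values only, one per line with
@option{-P} or one record per row with @option{-R}:

@example
$ recsel -t Contact -P Name contacts.rec
$ recsel -t Contact -R Name,Phone contacts.rec
@end example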
|
|
|
|
This @dfn{special option} is available to ease the communication
|
|
between the recutils and other programs, namely Lisp interpreters.
|
|
This option is not intended to be used by human operators.
|
|
|
|
@table @samp
|
|
@item --print-sexps
|
|
Print the data using sexps instead of rec format.
|
|
@end table
|
|
|
|
@node Invoking recins
|
|
@section Invoking recins
|
|
@cindex @command{recins}
|
|
@cindex inserting new records
|
|
|
|
@command{recins} adds new records to a rec file or to rec data read
|
|
from standard input. Synopsis:
|
|
|
|
@example
|
|
recins [@var{option}]@dots{} [-t @var{type}] \
|
|
[-n @var{indexes} | -e @var{record_expr} | -q @var{str} | -m @var{num}] \
|
|
[(-f @var{name} -v @var{value}) | -r @var{recdata}]@dots{} \
|
|
[@var{file}]
|
|
@end example
|
|
|
|
The record to insert is built from pairs of @samp{-f} and @samp{-v}
options, each pair defining one field, and from @samp{-r} options,
each of which contributes the fields of a complete record. The order
of these parameters is significant.
|
|
|
|
If no @var{file} is specified then the command acts like a filter, getting
|
|
the data from standard input and writing the result to
|
|
standard output.
|
|
|
|
In addition to the common options described earlier (@pxref{Common
|
|
Options}) the program accepts the following options.
|
|
|
|
@table @samp
|
|
@item -t @var{type}
@itemx --type=@var{type}
|
|
The type of the new record. If there is a record set in the input
|
|
data matching this type then the new record is added there. Otherwise
|
|
a new record set is created. If this parameter is not specified then
|
|
the new record is anonymous.
|
|
@item -f @var{name}
@itemx --field=@var{name}
Declare the name of a field. This option must be followed by a
@samp{-v}.
@item -v @var{value}
@itemx --value=@var{value}
The value of the field being defined.
@item -r @var{value}
@itemx --record=@var{value}
Add the fields of the record in @var{value}. This option can be
intermixed with @samp{-f @dots{} -v} pairs.
@item -s @var{secret}
@itemx --password=@var{secret}
Encrypt confidential fields with the given password.
|
|
@item --no-external
|
|
Don't use external record descriptors.
|
|
@item --verbose
|
|
Be verbose when reporting integrity problems.
|
|
@item --no-auto
|
|
Don't generate @dfn{auto} fields. @xref{Auto-Generated Fields}.
|
|
@end table
|
|
|
|
Record selection arguments are supported too. If they are used
|
|
then @command{recins} uses ``replacement mode'': instead of
|
|
appending the new record, matched records are replaced by copies of
|
|
the provided record. The selection arguments are the same as in
|
|
@command{recsel}:
|
|
|
|
@table @samp
|
|
@item -n @var{indexes}
|
|
@itemx --number=@var{indexes}
Match the records occupying the given positions in their record set.
@var{indexes} must be a comma-separated list of numbers or ranges, a
range being two numbers separated by a dash. Positions are counted
from zero. For example, the following list denotes the first, the
third, and the fifth through the tenth records: @code{-n 0,2,4-9}.
|
|
@item -e @var{record_expr}
@itemx --expression=@var{record_expr}
|
|
A record selection expression (@pxref{Selection Expressions}).
|
|
Matching records will get replaced.
|
|
@item -q @var{str}
|
|
@itemx --quick=@var{str}
|
|
Replace records having a field whose value contains the substring
|
|
@var{str}.
|
|
@item -m @var{num}
|
|
@itemx --random=@var{num}
|
|
Select @var{num} random records. If @var{num} is zero then all
|
|
records are selected, @ie{} no replace mode is activated.
|
|
@item -i
|
|
@itemx --case-insensitive
|
|
Make string matching case-insensitive in selection expressions.
|
|
@cindex case, in selection expressions
|
|
@item --force
|
|
Insert the requested record even in potentially dangerous situations,
|
|
such as when the data integrity of the database is compromised.
|
|
@end table
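
The two modes can be sketched as follows, assuming a hypothetical file
@file{contacts.rec} containing a @code{Contact} record set whose
records have @code{Name} and @code{Phone} fields (the file, type and
field names are made up for illustration). The first command appends a
new record. The second one carries a selection expression, so it
replaces the record whose @code{Name} is @samp{Ada} with the record
built from the @samp{-f}/@samp{-v} pairs:

@example
$ recins -t Contact -f Name -v Linus -f Phone -v 555-0103 contacts.rec
$ recins -t Contact -e "Name = 'Ada'" \
         -f Name -v Ada -f Phone -v 555-0199 contacts.rec
@end example

@noindent
Because a file name is given, both commands modify
@file{contacts.rec} in place rather than writing to standard output.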
|
|
|
|
@node Invoking recdel
|
|
@section Invoking recdel
|
|
@cindex @command{recdel}
|
|
@cindex deleting records
|
|
|
|
@command{recdel} removes records from a rec file, or from rec data
|
|
read from standard input. Synopsis:
|
|
|
|
@example
|
|
recdel [@var{option}]@dots{} [-t @var{type}] \
|
|
[-n @var{indexes} | -e @var{record_expr} | -q @var{str} | -m @var{num}] \
|
|
[@var{file}]
|
|
@end example
|
|
|
|
If no @var{file} is specified then the command acts like a filter,
|
|
getting the data from standard input and writing the result to
|
|
standard output.
|
|
|
|
In addition to the common options described earlier (@pxref{Common
|
|
Options}) the program accepts the following options.
|
|
|
|
@table @samp
|
|
@item -t @var{type}
@itemx --type=@var{type}
|
|
Remove records of the given type. If this parameter is not specified
|
|
then records of any type will be removed.
|
|
@item -n @var{indexes}
|
|
@itemx --number=@var{indexes}
Match the records occupying the given positions in their record set.
@var{indexes} must be a comma-separated list of numbers or ranges, a
range being two numbers separated by a dash. Positions are counted
from zero. For example, the following list denotes the first, the
third, and the fifth through the tenth records: @code{-n 0,2,4-9}.
|
|
@item -e @var{record_expr}
@itemx --expression=@var{record_expr}
|
|
A record selection expression (@pxref{Selection Expressions}). Only
|
|
the records matched by the expression will be removed from the file.
|
|
@item -q @var{str}
|
|
@itemx --quick=@var{str}
|
|
Remove records having a field whose value contains the substring
|
|
@var{str}.
|
|
@item -m @var{num}
|
|
@itemx --random=@var{num}
|
|
Remove @var{num} random records. If @var{num} is zero then remove all
|
|
the records.
|
|
@item -c
|
|
@itemx --comment
|
|
Comment the matching records out instead of removing them.
|
|
@item --force
|
|
Delete even in potentially dangerous situations, such as a request
|
|
to delete all the records of some type.
|
|
@item --no-external
|
|
Don't use external record descriptors.
|
|
@item -i
|
|
@itemx --case-insensitive
|
|
Make string matching case-insensitive in selection expressions.
|
|
@item --verbose
|
|
Be verbose when reporting integrity problems.
|
|
@end table
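
As a small sketch, assume a hypothetical file @file{contacts.rec}
whose @code{Contact} record set holds one record named @samp{Ada}
followed by one named @samp{Grace} (names invented for the example).
The first command removes the record matching the expression; the
second comments out the record at position 0 instead of deleting it,
so its text stays in the file:

@example
$ recdel -t Contact -e "Name = 'Grace'" contacts.rec
$ recdel -t Contact -c -n 0 contacts.rec
@end example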
|
|
|
|
@node Invoking recset
|
|
@section Invoking recset
|
|
@cindex @command{recset}
|
|
@cindex editing fields
|
|
|
|
@command{recset} manipulates the fields of records in a rec file, or
|
|
rec data read from standard input. Synopsis:
|
|
|
|
@example
|
|
recset [@var{option}]@dots{} [@var{file}]@dots{}
|
|
@end example
|
|
|
|
If no @var{file} is specified then the command acts like a filter,
|
|
getting the data from standard input and writing the result to
|
|
standard output.
|
|
|
|
In addition to the common options described earlier (@pxref{Common
|
|
Options}) the program accepts the following options.
|
|
|
|
Record selection options:
|
|
|
|
@table @samp
|
|
@item -i
|
|
@itemx --case-insensitive
|
|
Make string matching case-insensitive in selection expressions.
|
|
@item -t @var{type}
@itemx --type=@var{type}
|
|
Operate on the records of the given type. If this parameter is not
|
|
specified then records of any type will be affected.
|
|
@item -n @var{indexes}
|
|
@itemx --number=@var{indexes}
Operate on the records occupying the given positions in their record
set. @var{indexes} must be a comma-separated list of numbers or
ranges, a range being two numbers separated by a dash. Positions are
counted from zero. For example, the following list denotes the first,
the third, and the fifth through the tenth records: @code{-n 0,2,4-9}.
|
|
@item -e @var{expr}
|
|
@itemx --expression=@var{expr}
|
|
A record selection expression (@pxref{Selection Expressions}). Only
|
|
the records matched by the expression will be processed.
|
|
@item -q @var{str}
|
|
@itemx --quick=@var{str}
|
|
Operate on records having a field whose value contains the substring
|
|
@var{str}.
|
|
@item -m @var{num}
|
|
@itemx --random=@var{num}
|
|
Operate on @var{num} random records. If @var{num} is zero then
|
|
operate on all the records.
|
|
@end table
|
|
|
|
Field selection options:
|
|
|
|
@table @samp
|
|
@item -f @var{FEX}
@itemx --fields=@var{FEX}
Field selection expression (@pxref{Field Expressions}) selecting the
fields to operate on.
|
|
@end table
|
|
|
|
Actions:
|
|
|
|
@table @samp
|
|
@item -s @var{value}
@itemx --set=@var{value}
Set the value of the selected fields to @var{value}.
@item -a @var{value}
@itemx --add=@var{value}
Add a new field with value @var{value} to the selected records.
@item -S @var{value}
@itemx --set-add=@var{value}
Set the value of the selected fields to @var{value}. If a selected
field does not exist in a record, append it with the specified value.
@item -r @var{value}
@itemx --rename=@var{value}
|
|
Rename a field; @var{value} must be a valid field name. The field
|
|
expression associated with this action must contain a single field
|
|
name and an optional subscript. If an entire record set is selected
|
|
then the field is renamed in the record descriptor as well.
|
|
@item -d
|
|
@itemx --delete
|
|
Delete the selected fields in the selected records.
|
|
@item -c
|
|
@itemx --comment
|
|
Comment out the selected fields in the selected records.
|
|
@item --no-external
|
|
Don't use external record descriptors.
|
|
@item --verbose
|
|
Be verbose when reporting integrity problems.
|
|
@item --force
|
|
Perform the requested operation even in potentially dangerous
|
|
situations, or when the integrity of the data stored in the file is
|
|
affected.
|
|
@end table
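
A short sketch of how record selection, field selection and an action
combine, assuming a hypothetical file @file{contacts.rec} with a
@code{Contact} record set whose records have @code{Name} and
@code{Phone} fields; the @code{Email} and @code{Telephone} field names
are likewise invented. In order, the commands set the phone number of
one record, give every record an @code{Email} field if it lacks one,
and rename @code{Phone} across the whole record set:

@example
$ recset -t Contact -e "Name = 'Ada'" -f Phone -s 555-0199 contacts.rec
$ recset -t Contact -f Email -S unknown contacts.rec
$ recset -t Contact -f Phone -r Telephone contacts.rec
@end example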
|
|
|
|
@node Invoking recfix
|
|
@section Invoking recfix
|
|
@cindex @command{recfix}
|
|
@cindex checking recfiles
|
|
@cindex integrity, checking
|
|
|
|
@command{recfix} checks and fixes rec files. Synopsis:
|
|
|
|
@example
|
|
recfix [@var{option}]@dots{} [@var{operation}] [@var{op_option}]@dots{} [@var{file}]
|
|
@end example
|
|
|
|
If no @var{file} is specified then the command acts like a filter,
|
|
getting the data from standard input and writing the result to
|
|
standard output.
|
|
|
|
In addition to the common options described earlier (@pxref{Common
|
|
Options}) the program accepts the following global options.
|
|
|
|
@table @samp
|
|
@item --no-external
|
|
Don't use external record descriptors.
|
|
@end table
|
|
|
|
The effect of running @command{recfix} depends on the operation it
|
|
performs. The operation mode is selected by using one of the
|
|
following options.
|
|
|
|
@table @samp
|
|
@item --check
|
|
Check the integrity of the database contained in the file, printing
|
|
diagnostic messages if something is not right. This is the
|
|
default operation.
|
|
@item --sort
|
|
Perform a physical sort of all the records contained in the file (or
|
|
standard input) after checking its integrity. The sorting
|
|
criteria are provided by the @code{%sort} special field, if any. If
|
|
there is an integrity failure the sorting is not performed.
|
|
@cindex sorting
|
|
|
|
This is a destructive operation.
|
|
@item --decrypt
|
|
@itemx --encrypt
|
|
Decrypt all the encrypted fields (or encrypt all the unencrypted fields) in the database that are marked
|
|
as confidential. This operation requires a password. If no password
|
|
is specified with @option{-s} and the program is run in a terminal, a
|
|
prompt is given to get the password from the user.
|
|
|
|
If encryption is performed on a file having encrypted fields, the
|
|
operation will fail unless @samp{--force} is used.
|
|
|
|
These are destructive operations.
|
|
@item --auto
|
|
Insert auto-generated fields as appropriate in the records which are
|
|
missing them.
|
|
|
|
This is a destructive operation.
|
|
@end table
|
|
|
|
As described above, some operations make use of these additional options:
|
|
|
|
@table @samp
|
|
@item -s @var{secret}
|
|
@itemx --password=@var{secret}
|
|
Password used to encrypt or decrypt fields.
|
|
@item --force
|
|
Force potentially dangerous operations.
|
|
@end table
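
The operations might be invoked as sketched below. The file names are
hypothetical: @file{contacts.rec} is assumed to declare a
@code{%sort} field in its record descriptor, and @file{accounts.rec}
is assumed to mark some fields as confidential. The password shown is
a placeholder, and the last two commands only work if recutils was
built with encryption support.

@example
$ recfix --check contacts.rec
$ recfix --sort contacts.rec
$ recfix --encrypt -s mypassword accounts.rec
$ recfix --decrypt -s mypassword accounts.rec
@end example

@noindent
Since all but the first of these rewrite the file, keeping a copy of
the original before running them is a reasonable precaution.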
|
|
|
|
@node Invoking recfmt
|
|
@section Invoking recfmt
|
|
@cindex @command{recfmt}
|
|
@cindex formatted output
|
|
|
|
@command{recfmt} formats records using templates. Synopsis:
|
|
|
|
@example
|
|
recfmt [@var{option}]@dots{} [@var{template}]
|
|
@end example
|
|
|
|
This program always works as a filter, getting the data from the
|
|
standard input and writing the result to standard output.
|
|
|
|
In addition to the common options described earlier (@pxref{Common
|
|
Options}) the program accepts the following options.
|
|
|
|
@table @samp
|
|
@item -f
|
|
@itemx --filename=@var{PATH}
|
|
Read the template from the file in @var{PATH} instead of the command
|
|
line.
|
|
@end table
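
As an illustrative sketch, assume a hypothetical file
@file{contacts.rec} with a @code{Contact} record set whose records
have @code{Name} and @code{Phone} fields, and a hypothetical template
file @file{contact.tmpl}. Slots in a template are written between
double braces. The first command passes the template on the command
line; the second reads it from the file:

@example
$ recsel -t Contact contacts.rec | recfmt '@{@{Name@}@}: @{@{Phone@}@}'
$ recsel -t Contact contacts.rec | recfmt -f contact.tmpl
@end example

@noindent
The template is applied once per input record, so a template meant to
produce one line per record would normally end with a newline.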
|
|
|
|
@node Invoking csv2rec
|
|
@section Invoking csv2rec
|
|
@cindex @command{csv2rec}
|
|
@cindex csv
|
|
@cindex comma separated values
|
|
|
|
@command{csv2rec} reads the given comma-separated-values file (or the
|
|
data from standard input if no file is specified) and prints out the
|
|
converted rec data, if possible. Synopsis:
|
|
|
|
@example
|
|
csv2rec [@var{option}]@dots{} [@var{csv_file}]
|
|
@end example
|
|
|
|
In addition to the common options described earlier (@pxref{Common
|
|
Options}) the program accepts the following options.
|
|
|
|
@table @samp
|
|
@item -t @var{type}
|
|
@itemx --type=@var{type}
|
|
Type of the converted records. If no type is specified then the
converted records are emitted without a type.
|
|
@item -s
|
|
@itemx --strict
|
|
Be strict when parsing the CSV file.
|
|
@item -e
|
|
@itemx --omit-empty
|
|
Omit empty fields.
|
|
@end table
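
For instance, assume a hypothetical file @file{contacts.csv} whose
first line is @samp{Name,Phone} and whose remaining lines hold the
data; this sketch relies on the first line being taken as the list of
field names. The first command converts the file into records of type
@code{Contact}; the second reads the CSV data from standard input and
drops empty fields:

@example
$ csv2rec -t Contact contacts.csv > contacts.rec
$ cat contacts.csv | csv2rec -e -t Contact
@end example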
|
|
|
|
@node Invoking rec2csv
|
|
@section Invoking rec2csv
|
|
|
|
@cindex @command{rec2csv}
|
|
@cindex csv
|
|
@cindex comma separated values
|
|
|
|
@command{rec2csv} reads the given rec files (or the data in the
|
|
standard input if no file is specified) and prints out the converted
|
|
comma-separated-values. Synopsis:
|
|
|
|
@example
|
|
rec2csv [@var{option}]@dots{} [@var{rec_file}]@dots{}
|
|
@end example
|
|
|
|
The converted data is written to standard output.
|
|
|
|
In addition to the common options described earlier (@pxref{Common
|
|
Options}) the program accepts the following options.
|
|
|
|
@table @samp
|
|
@item -t @var{type}
|
|
@itemx --type=@var{type}
|
|
Type of the records to convert. If no type is specified then the
|
|
default records (with no name) are converted.
|
|
@item -S @var{fields}
@itemx --sort=@var{fields}
Sort the output by the comma-separated list of field names,
@var{fields}. This option takes precedence over any sorting criteria
specified in the corresponding record descriptor with @code{%sort}.
|
|
@item -d @var{char}
@itemx --delim=@var{char}
|
|
Use @var{char} as the delimiter character separating fields in the
|
|
output. Defaults to @code{,}.
|
|
@end table
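
A minimal sketch, assuming a hypothetical file @file{contacts.rec}
with a @code{Contact} record set whose records have @code{Name} and
@code{Phone} fields. The first command converts that record set to
CSV; the second sorts the rows by @code{Name} and uses a semicolon as
the delimiter:

@example
$ rec2csv -t Contact contacts.rec > contacts.csv
$ rec2csv -t Contact -S Name -d ';' contacts.rec
@end example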
|
|
|
|
@node Invoking mdb2rec
|
|
@section Invoking mdb2rec
|
|
@cindex @command{mdb2rec}
|
|
@cindex mdb
|
|
@cindex MS Access
|
|
|
|
@command{mdb2rec} reads the given mdb file and prints out the
|
|
converted rec data, if possible. Synopsis:
|
|
|
|
@example
|
|
mdb2rec [@var{option}]@dots{} @var{mdb_file} [@var{table}]
|
|
@end example
|
|
|
|
All the tables contained in the mdb file are exported unless a table
|
|
is specified in the command line.
|
|
|
|
In addition to the common options described earlier (@pxref{Common
|
|
Options}) the program accepts the following options.
|
|
|
|
@table @samp
|
|
@item -s
|
|
@itemx --system-tables
|
|
Include system tables in the output.
|
|
@item -l
|
|
@itemx --list-tables
|
|
Dump a list of the table names contained in the mdb file, one per
|
|
line.
|
|
@item -e
|
|
@itemx --keep-empty-fields
|
|
Don't prune empty fields in the rec output.
|
|
@end table
|
|
|
|
@node Using ob-rec.el
|
|
@chapter Using ob-rec.el
|
|
|
|
ob-rec.el allows you to use Recutils as a language in org-mode source
|
|
blocks.
|
|
|
|
@section Setup
|
|
|
|
Recutils should install the necessary files where Emacs can find them.

In your @file{.emacs} you may need to add:
|
|
@example
|
|
(require 'ob-rec)
|
|
@end example
|
|
|
|
You will need to add @code{rec} to your list of
@code{org-babel-load-languages}, as shown below:
|
|
@example
|
|
(org-babel-do-load-languages
|
|
'org-babel-load-languages
|
|
'((rec . t)))
|
|
@end example
|
|
|
|
@section Usage
|
|
|
|
To your org file, add a src code block like:
|
|
@example
|
|
#+BEGIN_SRC rec :data books.rec
|
|
Location = 'loaned'
|
|
#+END_SRC
|
|
@end example
|
|
|
|
This performs the equivalent of the command:
|
|
@example
|
|
$ recsel -e "Location = 'loaned'" books.rec
|
|
@end example
|
|
|
|
It will produce a result like:
|
|
@example
|
|
#+RESULTS:
|
|
| Title | Author | Date | Location |
|
|
|---------------------+-----------------+-----------------+----------|
|
|
| The Colour of Magic | Terry Pratchett | 4/20/01 11:15pm | loaned |
|
|
@end example
|
|
|
|
@section Header Arguments
|
|
|
|
@table @samp
|
|
@item :data
|
|
The recfile you would like to query. Can be a relative path. Spaces in
|
|
the filename or path need to be escaped with a backslash (for example,
|
|
file\ name.rec). This is the only required header argument.
|
|
|
|
@item :results
|
|
If this list contains "scalar", "html", "code" or "verbatim" then the
|
|
output will look the same as if called from the command line and it
|
|
will not be put into an org table.
|
|
|
|
@item :type
|
|
Return only records of this type. Corresponds to the @option{-t}
argument. Accepts a single type name.
|
|
|
|
@item :fields
|
|
Comma-separated list of fields to print. Corresponds to the -p argument.
|
|
|
|
@item :sort
|
|
Comma-separated list of fields by which to sort records. Corresponds to
|
|
the -S argument.
|
|
|
|
@item :groupby
|
|
Comma-separated list of fields by which to group records. If the
|
|
records grouped together share fields in common, these will be in
|
|
separate columns with a "_N" appended. Corresponds to the -G argument.
|
|
|
|
@item :join
|
|
Field on which to join records from one record set to another.
Corresponds to the @option{-j} argument. For more on how joins work,
see the description of @option{-j} in @ref{Invoking recsel}.
|
|
@end table
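
Going by the correspondences listed above, a source block whose header
reads @code{:data books.rec :type Book :fields Title,Author :sort Title}
and whose body is @code{Location = 'loaned'} would be expected to
behave roughly like the command below. The @code{Book} type name is an
assumption made for this sketch; the usage example in this chapter
does not say how the records in @file{books.rec} are typed.

@example
$ recsel -t Book -p Title,Author -S Title \
         -e "Location = 'loaned'" books.rec
@end example

@noindent
Running the command directly is a convenient way to check a query
before embedding it in an org file.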
|
|
|
|
@section Warnings
|
|
|
|
@enumerate
|
|
@item
|
|
Output may be unpredictable if fields contain newlines, as would be the case
|
|
for a multi-line field. This appears to be a limitation in org-mode's
|
|
@code{org-table-convert-region} function.
|
|
@end enumerate
|
|
|
|
@node Regular Expressions
|
|
@chapter Regular Expressions
|
|
|
|
@cindex regular expressions
|
|
The character @samp{.} matches any single character except the null character.
|
|
|
|
@table @samp
|
|
|
|
@item *
matches zero or more occurrences of the previous atom or regexp.
@item +
matches one or more occurrences of the previous atom or regexp.
@item ?
matches zero or one occurrence of the previous atom or regexp.
@item \+
matches a @samp{+}.
|
|
@item \?
|
|
matches a @samp{?}.
|
|
@end table
|
|
|
|
Bracket expressions are used to match ranges of characters.
|
|
Bracket expressions where the range is backward, for example @samp{[z-a]}, are invalid.
|
|
Within square brackets, @samp{\} is taken literally.
|
|
Character classes are supported; for example @samp{[[:digit:]]} matches a single decimal digit.
|
|
|
|
GNU extensions are supported:
|
|
@table @samp
|
|
@item \w
|
|
matches a character within a word
|
|
@item \W
|
|
matches a character which is not within a word
|
|
@item \<
|
|
matches the beginning of a word
|
|
@item \>
|
|
matches the end of a word
|
|
@item \b
|
|
matches a word boundary
|
|
@item \B
|
|
matches a position which is not a word boundary
|
|
@item \`
|
|
matches the beginning of the whole input
|
|
@item \'
|
|
matches the end of the whole input
|
|
@end table
|
|
|
|
@cindex grouping, within regular expressions
|
|
Grouping is performed with parentheses @samp{()}. An unmatched
|
|
@samp{)} matches just itself. A backslash followed by a digit acts as
|
|
a back-reference and matches the same thing as the previous grouped
|
|
expression indicated by that number. For example, @samp{\2} matches
|
|
the second group expression. The order of group expressions is
|
|
determined by the position of their opening parenthesis @samp{(}.
|
|
|
|
The alternation operator is @samp{|}.
|
|
|
|
The characters @samp{^} and @samp{$} always represent the beginning
|
|
and end of a string respectively, except within square brackets.
|
|
Within brackets, an initial @samp{^} inverts the
|
|
character class being matched.
|
|
|
|
@samp{*}, @samp{+} and @samp{?} are special at any point in a regular
|
|
expression except the following places, where they are not allowed:
|
|
@enumerate
|
|
@item At the beginning of a regular expression
|
|
@item After an open-group, @samp{(}
|
|
@item After the alternation operator, @samp{|}
|
|
@end enumerate
|
|
|
|
Intervals are specified by @samp{@{} and @samp{@}}. Invalid intervals
|
|
such as @samp{a@{1z} are not accepted.
|
|
|
|
The longest possible match is returned; this applies to the regular
|
|
expression as a whole and (subject to this constraint) to
|
|
sub-expressions within groups.
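
In practice these expressions are mostly used with the @samp{~} match
operator of selection expressions (@pxref{Selection Expressions}). As
a sketch, assume a hypothetical file @file{contacts.rec} with a
@code{Contact} record set whose records have a @code{Name} field and a
@code{Phone} field of the form @samp{555-0101}. The first command
combines an anchor, a group and alternation; the second uses character
classes with intervals:

@example
$ recsel -t Contact -e "Name ~ '^(Ada|Grace)'" contacts.rec
$ recsel -t Contact \
         -e "Phone ~ '[[:digit:]]@{3@}-[[:digit:]]@{4@}'" contacts.rec
@end example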
|
|
|
|
@c @lowersections
|
|
@include parse-datetime.texi
|
|
@c @raisesections
|
|
|
|
@node GNU Free Documentation License
|
|
@appendix GNU Free Documentation License
|
|
@cindex license, GNU Free Documentation License
|
|
|
|
@include fdl.texi
|
|
|
|
|
|
@node Concept Index
|
|
@unnumbered Concept Index
|
|
|
|
@printindex cp
|
|
|
|
|
|
@bye
|