PEA file format specifications 1.0
PEA, an acronym for Pack, Encrypt, Authenticate, is a file format that aims to provide archiving, compression and multi-volume file splitting in a single pass, along with flexible integrity check schemes and authenticated encryption. The PEA file format specifications are released into the public domain.
The PEA security model acts at three levels: objects, streams and volumes. Each of these levels can be omitted as needed by the user:
- Object level integrity checking detects errors with object level granularity on the raw input data and all associated data (name, size, attributes, date and time);
- Stream level checking offers a wide choice of algorithms, up to authenticated encryption, protecting the privacy and authenticity of a group of objects sharing the same security needs, including the tags generated by object level checks;
- Volume level integrity checking is communication oriented and allows single corrupted volumes to be discarded, minimizing the retransmission overhead in case of error.
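The three levels can be pictured with a minimal Python sketch. It is only a conceptual illustration: it uses SHA-256 at every level, and the function names and the byte layout (object data followed by its tag) are assumptions made for the example, not the layout defined by the PEA specification.

```python
import hashlib


def object_tag(name: str, data: bytes) -> bytes:
    """Object level: one tag over the object's name and its raw data."""
    h = hashlib.sha256()
    h.update(name.encode("utf-8"))
    h.update(data)
    return h.digest()


def build_stream(objects: list[tuple[str, bytes]]) -> tuple[bytes, bytes]:
    """Stream level: the stream tag also covers the object level tags."""
    body = b"".join(data + object_tag(name, data) for name, data in objects)
    return body, hashlib.sha256(body).digest()


def split_volumes(stream: bytes, payload_size: int) -> list[tuple[bytes, bytes]]:
    """Volume level: each chunk gets its own, independent tag."""
    chunks = [stream[i:i + payload_size] for i in range(0, len(stream), payload_size)]
    return [(chunk, hashlib.sha256(chunk).digest()) for chunk in chunks]


def corrupted_volumes(volumes: list[tuple[bytes, bytes]]) -> list[int]:
    """Return the indexes of the volumes whose tag no longer matches."""
    return [i for i, (chunk, tag) in enumerate(volumes)
            if hashlib.sha256(chunk).digest() != tag]
```

Because every volume carries an independent tag, a receiver can name exactly the volumes that need to be sent again, without touching the stream or object level checks.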
Arbitrarily sized volume spanning allows the archive to be split into volumes of arbitrary size, with the only constraint that each volume be at least 10 bytes larger than the volume control tag, so that the minimum needed information can be passed (through the archive's header) to the extraction application.
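As a rough illustration of this constraint, the sketch below computes the smallest allowed volume size for a given volume check algorithm. It assumes that the volume control tag is exactly as long as the algorithm's standard output and that every volume carries one such tag; the names `TAG_SIZE`, `min_volume_size` and `volume_count` are hypothetical.

```python
# Standard output sizes, in bytes, of the supported check algorithms.
TAG_SIZE = {
    "ADLER32": 4, "CRC32": 4, "CRC64": 8,
    "MD5": 16, "SHA1": 20, "RIPEMD160": 20,
    "SHA256": 32, "SHA512": 64, "WHIRLPOOL": 64,
}

MIN_EXTRA = 10  # bytes required beyond the tag, as stated in the text above


def min_volume_size(algorithm: str) -> int:
    """Smallest volume size accepted for the given volume check algorithm."""
    return TAG_SIZE[algorithm] + MIN_EXTRA


def volume_count(data_size: int, volume_size: int, algorithm: str) -> int:
    """Number of volumes needed, assuming each volume ends with one tag."""
    if volume_size < min_volume_size(algorithm):
        raise ValueError("volume too small for the selected check algorithm")
    payload = volume_size - TAG_SIZE[algorithm]
    return -(-data_size // payload)  # ceiling division
```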
The PEA file format as defined in the version 1 revision 0 specification can store a single stream containing an unlimited number of objects, each up to 2^64 bytes in size. The current Pea archiver utility supports the 1.0 file format specifications (in practice, archives are limited by memory and filesystem rather than by the format).
The PEA 2.0 file format specifications extend the concepts behind the PEA 1.0 file format and can store an unlimited number of streams, but this format is not currently supported by the Pea archiving utility.
For an exhaustive explanation and discussion of the format specifications, please see the documentation about the Pea archive format (.pdf).
The Pea executable is the engine implementing PEA file format archiving and extraction; it is released as an open source freeware program under LGPLv3.
The Pea archiving and file encryption utility can be compiled as a self-contained executable that can be used from batch scripts or invoked by a GUI frontend like PeaZip, or it can be used as a library from an external application. It supports its native .pea archive format and also raw file split / join, with optional integrity checks ranging from CRCs to cryptographically strong hashes.
Compression
Data is compressed with a deflate-based compression scheme (PCOMPRESS) defined in the PEA file format specifications, resulting in a compression ratio and compression speed similar to typical compressors of that class, such as Zip, PKZip and GZip.
Please note that at the present stage of development of the PEA format, maximum compression is out of the scope of the project; a fast and versatile deflate-based compression scheme was chosen to offer a reasonable tradeoff between compression ratio and speed, making PEA suitable for most uses.
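The tradeoff between ratio and speed can be explored with Python's standard `zlib` module, which implements deflate. The sketch below maps the PCOMPRESS scheme numbers to the deflate levels listed in the feature table further down this page (3, 6 and 9); the helper names are hypothetical, and the output is a plain zlib stream, not PEA's own framing.

```python
import zlib

# Scheme number -> deflate level; None means the data is stored as is.
PCOMPRESS_LEVEL = {0: None, 1: 3, 2: 6, 3: 9}


def pcompress_like(data: bytes, scheme: int) -> bytes:
    """Compress data with the deflate level associated with the scheme."""
    level = PCOMPRESS_LEVEL[scheme]
    return data if level is None else zlib.compress(data, level)


def pdecompress_like(blob: bytes, scheme: int) -> bytes:
    """Reverse pcompress_like for the same scheme number."""
    return blob if PCOMPRESS_LEVEL[scheme] is None else zlib.decompress(blob)


if __name__ == "__main__":
    sample = b"PEA: Pack, Encrypt, Authenticate. " * 2000
    for scheme in sorted(PCOMPRESS_LEVEL):
        packed = pcompress_like(sample, scheme)
        assert pdecompress_like(packed, scheme) == sample
        print(f"scheme {scheme}: {len(sample)} -> {len(packed)} bytes")
```

Higher levels spend more time searching for matches; on real data the size gain from level 6 to level 9 is often small compared with the extra time.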
For a standardised compression benchmark, including PeaZip's compression ratio using the PEA format, see Matt Mahoney's Large Text Compression Benchmark.
Integrity check algorithms
Checksum/CRC, hash and encryption algorithms are provided by Wolfgang Ehrhardt's Pascal/Delphi crypto library (under the zlib license).
| Purpose | Algorithms |
|---|---|
| Pea: object, stream and volume level; raw file join/split optional integrity check | |
| Checksum/CRC | Adler32, CRC32, CRC64 |
| Hash | MD5, SHA1, RIPEMD-160, SHA256, SHA512, Whirlpool |
| Pea: stream level only | |
| Authenticated encryption | AES128 HMAC, AES128 EAX, AES256 EAX |
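To compare values reported by Pea against an independent tool, several of the listed checksums and hashes can be computed with Python's standard library, as in the minimal sketch below. The function name `digests` is hypothetical. CRC64, RIPEMD-160 and Whirlpool are left out because the Python standard library does not guarantee them, and the authenticated encryption modes are not covered at all.

```python
import hashlib
import zlib


def digests(data: bytes) -> dict[str, str]:
    """Return hexadecimal checksums and hashes of data, keyed by algorithm."""
    return {
        "Adler32": f"{zlib.adler32(data) & 0xFFFFFFFF:08x}",
        "CRC32": f"{zlib.crc32(data) & 0xFFFFFFFF:08x}",
        "MD5": hashlib.md5(data).hexdigest(),
        "SHA1": hashlib.sha1(data).hexdigest(),
        "SHA256": hashlib.sha256(data).hexdigest(),
        "SHA512": hashlib.sha512(data).hexdigest(),
    }


if __name__ == "__main__":
    for name, value in digests(b"123456789").items():
        print(f"{name:8} {value}")
```

Checksums and CRCs only detect accidental errors; a cryptographically strong hash is needed when deliberate tampering is a concern.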
File split and join
Pea's file split and join features are compatible with most file split applications; just provide an input file and the desired output size to split it.
Optionally, Pea can save a control file containing a checksum or hash (see the Control algorithm paragraph) of each volume and of the original file, allowing file level and volume level integrity checks; this control file is ignored by other file split utilities.
When merging the split file back, Pea checks this control file and gives a simple warning if it is not found (e.g. because the file was split by another file split application), or raises an error message if the data does not match.
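The behaviour described above (split, optional control file, warning when the control file is missing, error on mismatch) can be sketched in Python as follows. The `.001`, `.002` naming, the `.check` control file and its line format are assumptions made for the example and are not Pea's actual file formats; the sketch also reads whole files into memory for brevity.

```python
import hashlib
from pathlib import Path


def _sha256(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def split_file(src: Path, volume_size: int) -> list[Path]:
    """Split src into src.001, src.002, ... and write a control file."""
    data = src.read_bytes()
    lines = [f"* {_sha256(data)}"]  # "*" stands for the original file
    parts = []
    for n, start in enumerate(range(0, len(data), volume_size), start=1):
        chunk = data[start:start + volume_size]
        part = src.with_name(f"{src.name}.{n:03d}")
        part.write_bytes(chunk)
        lines.append(f"{part.name} {_sha256(chunk)}")
        parts.append(part)
    src.with_name(src.name + ".check").write_text("\n".join(lines) + "\n")
    return parts


def join_files(base: Path, dest: Path) -> str:
    """Join base.001, base.002, ... into dest and verify the control file."""
    chunks = []
    n = 1
    while (part := base.with_name(f"{base.name}.{n:03d}")).exists():
        chunks.append((part.name, part.read_bytes()))
        n += 1
    if not chunks:
        raise FileNotFoundError(f"no volumes found for {base}")
    data = b"".join(chunk for _, chunk in chunks)
    dest.write_bytes(data)
    control = base.with_name(base.name + ".check")
    if not control.exists():
        # Typical when another tool produced the volumes.
        return "warning: control file not found, integrity not verified"
    expected = dict(line.rsplit(" ", 1) for line in control.read_text().splitlines())
    actual = {"*": _sha256(data), **{name: _sha256(c) for name, c in chunks}}
    if expected != actual:
        raise ValueError("control file does not match the joined data")
    return "ok"
```

Keeping one entry per volume in the control file is what makes it possible to tell which volume is damaged, rather than only that the joined file is wrong.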
Other features
The Pea executable reuses the implemented features (encryption, checksum/hash, randomness sampling, etc.) for other general utility tasks, not necessarily related to file archiving, also providing:
- secure data deletion
- file checksum/hash utility
- byte-to-byte file compare utility
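As an illustration of the last item, a byte-to-byte comparison can be sketched in a few lines of Python. This is not Pea's implementation; the function name `first_difference` and the chunk size are choices made for the example.

```python
from pathlib import Path
from typing import Optional


def first_difference(a: Path, b: Path, chunk_size: int = 65536) -> Optional[int]:
    """Return the offset of the first differing byte, or None if identical.

    If one file is a prefix of the other, the offset is the shorter length.
    """
    offset = 0
    with open(a, "rb") as fa, open(b, "rb") as fb:
        while True:
            ca, cb = fa.read(chunk_size), fb.read(chunk_size)
            if ca != cb:
                for i, (x, y) in enumerate(zip(ca, cb)):
                    if x != y:
                        return offset + i
                return offset + min(len(ca), len(cb))
            if not ca:  # both files ended at the same point
                return None
            offset += len(ca)
```

Unlike a hash comparison, this reports where the files diverge and reads both files only up to the first difference.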
You don't need a separate download to get the Pea archiving utility's executable, sources and documentation, since they are part of the respective PeaZip packages.
Here is a brief table of features and limitations applying to the file format and to the current implementation:
| Feature | PEA file format | Current implementation |
|---|---|---|
| Archive | | |
| Max archive size | Unlimited | Up to 999999 volumes of 2^64-1 bytes each; with 128 bit block encryption it is advisable not to encrypt more than 2^64 bytes with the same key, preferably staying one or more orders of magnitude below |
| Stream number | 1.0: single stream; 2.0: unlimited number of streams | Single stream (1.0 file format) |
| Output | | |
| Security | Optional authenticated encryption, at stream level only | (same) |
| Integrity check | AE tag, hash or checksum at stream level; hash or checksum for input objects and output volumes | (same) |
| Error correction | No scheme featured | (same) |
| Communication recovery | Independent volume control checks allow corrupted volumes to be identified (the first volume may be needed to know the volume check algorithm) | No specific tool developed; the volume check is done during extraction, so that only corrupted volumes need to be downloaded again |
| Data recovery | Stream control tags allow correct streams to be recognized; if finer granularity is needed, object control tags allow correct objects to be recognized; input object names and POD trigger allow objects and streams to be identified within the archive data | No specific tool developed for error-resistant data extraction; however, object check errors are reported, identifying corrupted and non-corrupted data if the extraction is successful |
| Support for multi-volume output | Native, requires a single pass | (same) |
| Volume number | 1..unlimited | 1..999999 (counter in output name) |
| Volume size | Volume tag size + 1..unlimited; the first volume must contain at least 10 bytes of data to allow parsing of the archive header, so that the unpacking application can calculate the volume tag size | Volume tag size + 1..2^64-1 (qword variable); the first volume must contain at least 10 bytes of data |
| Compression | Native, requires a single pass; schemes: PCOMPRESS0: no compression; PCOMPRESS1..3: based on deflate using zlib's compres/uncompres, level 3, 6 and 9 respectively | (same) |
| Solid archive | No compression mode allowing the creation of solid archives is implemented | (same) |
| Input | | |
| Input types | 1.0: files and dirs; 2.0: files, dirs, metadata stored as message triggers | Files and dirs (1.0) |
| Input objects number | 1..unlimited | Limited by host system memory (the input object list is stored in a dynamic array of strings) |
| Input object size (of single objects) | 0..2^64-1 | 0..2^64-1 |
| Input object qualified name size (size 0 means that the archive object is a trigger, with no input object mapped to the archive object) | 1..2^16-1 | 1..32K (exceeds needs; longer values are considered errors) |
| Metadata | Object attributes and last modification time; optionally comments and any kind of meta content using messages | Saves object attributes and object last modification time; restores only object attributes (on Windows), nothing on *x |
Third-party technologies
Pea uses Wolfgang Ehrhardt's Pascal/Delphi crypto library.
Software supporting PEA archives