fastgedcom.base

Classes and types for the data structure used to represent a gedcom.

Attributes

SubmRef

The cross-reference identifier of type '@SUB1@' or '@U1@' for a submitter.

SubnRef

Deprecated. The cross-reference identifier of type '@SUB2@' for a submission.

IndiRef

The cross-reference identifier of type '@I1@' for an individual.

FamRef

The cross-reference identifier of type '@F1@' for a family.

SNoteRef

The cross-reference identifier of type '@N1@' for a shared note.

SourRef

The cross-reference identifier of type '@S1@' for a source document.

RepoRef

The cross-reference identifier of type '@R1@' for a repository (an archive).

ObjeRef

The cross-reference identifier of type '@O1@' for an object (e.g. an image).

XRef

The cross-reference identifier indicates a record to which payloads may point.

VoidRef

A pointer used for an unknown value where the payload can't be left empty.

Pointer

Generic pointer that is used in the payload to reference an existing record.

Record

A level 0 line referenced by an XRef in the document.

fake_line

FakeLine instance returned by functions.

Classes

Line

Abstract base class for gedcom lines.

FakeLine

Dummy line for syntactic sugar.

TrueLine

Represent a line of a gedcom document.

Document

Store all the information of the gedcom document.

Module Contents

fastgedcom.base.SubmRef: TypeAlias = str[source]

The cross-reference identifier of type ‘@SUB1@’ or ‘@U1@’ for a submitter of the document.

fastgedcom.base.SubnRef: TypeAlias = str[source]

Deprecated. The cross-reference identifier of type ‘@SUB2@’ for a submission.

fastgedcom.base.IndiRef: TypeAlias = str[source]

The cross-reference identifier of type ‘@I1@’ for an individual.

fastgedcom.base.FamRef: TypeAlias = str[source]

The cross-reference identifier of type ‘@F1@’ for a family.

fastgedcom.base.SNoteRef: TypeAlias = str[source]

The cross-reference identifier of type ‘@N1@’ for a shared note.

fastgedcom.base.SourRef: TypeAlias = str[source]

The cross-reference identifier of type ‘@S1@’ for a source document.

fastgedcom.base.RepoRef: TypeAlias = str[source]

The cross-reference identifier of type ‘@R1@’ for a repository (an archive).

fastgedcom.base.ObjeRef: TypeAlias = str[source]

The cross-reference identifier of type ‘@O1@’ for an object (e.g. an image).

fastgedcom.base.XRef: TypeAlias = SubmRef | SubnRef | IndiRef | FamRef | SNoteRef | SourRef | RepoRef | ObjeRef[source]

The cross-reference identifier indicates a record to which payloads may point.

fastgedcom.base.VoidRef: TypeAlias = Literal['@VOID@'][source]

A pointer used for an unknown value where the payload can’t be left empty.

e.g.: In a family record, the line ‘2 CHIL @VOID@’ indicates that the parents had a child about whom we know nothing. The line is kept to preserve the birth order of the children.
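
As a minimal sketch of how the ‘@VOID@’ placeholder might be handled when reading the children of a family, the snippet below redefines the aliases locally so that it runs without the library. The helper `known_children` and the sample data are hypothetical and not part of fastgedcom.

```python
from typing import Literal, Union

# Local stand-ins mirroring the aliases documented above (assumption:
# redefined here only so the sketch runs on its own).
IndiRef = str
VoidRef = Literal["@VOID@"]
Pointer = Union[IndiRef, VoidRef]

VOID: VoidRef = "@VOID@"


def known_children(chil_payloads: list[Pointer]) -> list[IndiRef]:
    """Hypothetical helper: drop @VOID@ placeholders, keep real references."""
    return [ref for ref in chil_payloads if ref != VOID]


# Payloads of the CHIL lines of a family, in birth order. The second
# child is unknown but still occupies its position.
children: list[Pointer] = ["@I3@", "@VOID@", "@I5@"]

print(known_children(children))  # ['@I3@', '@I5@']
print(children.index("@I5@"))    # 2: the placeholder kept the third position
```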

fastgedcom.base.Pointer: TypeAlias = XRef | VoidRef[source]

Generic pointer that is used in the payload to reference an existing record or a non-existing one.

class fastgedcom.base.Line[source]

Bases: abc.ABC

Abstract base class for gedcom lines.

Implementations are TrueLine and FakeLine, see these classes for more information.

abstract __bool__() bool[source]

True if it is a TrueLine, False if it is a FakeLine.

abstract property payload: str[source]

See the description of TrueLine class.

abstract property payload_with_cont: str[source]

Return the multi-line payload as a single string.

Multi-line payloads are split into several lines, as written in the original gedcom file. The continuation sub-lines carry the tags CONC and CONT. They are gathered into a single string by concatenating the payload of each line. A newline is inserted before the payload of each sub-line with the CONT tag.
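
The concatenation rule can be sketched as follows. This is an illustration of the rule as described, not the library's code: `join_payload` is a hypothetical helper working on plain `(tag, payload)` pairs, and skipping other tags is an assumption of the sketch.

```python
def join_payload(payload: str, sub_lines: list[tuple[str, str]]) -> str:
    """Hypothetical helper: rebuild a multi-line payload.

    sub_lines holds (tag, payload) pairs in file order.
    """
    text = payload
    for tag, sub_payload in sub_lines:
        if tag == "CONC":
            text += sub_payload         # continues the current line
        elif tag == "CONT":
            text += "\n" + sub_payload  # starts a new line
        # Assumption: any other tag is not part of the payload and is skipped.
    return text


note = join_payload(
    "This is a long no",
    [("CONC", "te that was split."), ("CONT", "Second line."), ("DATE", "1 JAN 2000")],
)
print(note)
# This is a long note that was split.
# Second line.
```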

abstract property sub_lines: list[TrueLine][source]

See the description of TrueLine class.

__iter__() Iterator[TrueLine][source]

Iterate on sub-lines, i.e. the next-level lines that are part of this structure.

abstract get_sub_lines(tag: str) list[TrueLine][source]

Return all sub-lines having the given tag. Return an empty list if no line matches.

__rshift__(tag: str) list[TrueLine][source]

Alias for get_sub_lines() to shorten the syntax by using the >> operator.

abstract get_sub_line(tag: str) TrueLine | FakeLine[source]

Return the first sub-line having the given tag. Return a FakeLine if no line matches.

__gt__(tag: str) TrueLine | FakeLine[source]

Alias for get_sub_line() to shorten the syntax by using the > operator.

abstract get_sub_line_payload(tag: str) str[source]

Return the payload of the first sub-line having the given tag. Return an empty string if no line matches.

__ge__(tag: str) str[source]

Alias for get_sub_line_payload() to shorten the syntax by using the >= operator.

get_all_sub_lines() Iterator[TrueLine][source]

Recursively iterate on sub-lines. All lines under the given line are returned. The order of the gedcom file is preserved: sub-sub-lines come before sibling lines.
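
The traversal order described above is a depth-first walk. A minimal sketch, assuming a hypothetical `Node` class standing in for a line (it is not the library's TrueLine):

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Iterator


@dataclass
class Node:
    """Hypothetical stand-in for a line: tag, payload, nested sub-lines."""
    tag: str
    payload: str = ""
    sub_lines: list[Node] = field(default_factory=list)


def walk(line: Node) -> Iterator[Node]:
    """Yield every line under `line`, depth first, in file order."""
    for sub in line.sub_lines:
        yield sub
        yield from walk(sub)  # descendants come before the next sibling


indi = Node("@I1@", "INDI", [
    Node("NAME", "John /Doe/", [Node("SURN", "Doe")]),
    Node("SEX", "M"),
])
print([n.tag for n in walk(indi)])  # ['NAME', 'SURN', 'SEX']
```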

get_source() str[source]

Return the gedcom text equivalent for the line and its sub-lines.
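
To illustrate what a "gedcom text equivalent" looks like, here is a rough sketch that serializes a nested structure back to text. The `Node` class and `to_gedcom` function are hypothetical; the exact output of the library (trailing newline, spacing) may differ.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for a line in the Level Tag Payload format."""
    level: int
    tag: str
    payload: str = ""
    sub_lines: list[Node] = field(default_factory=list)


def to_gedcom(line: Node) -> str:
    """Render the line, then each of its sub-lines, one per text line."""
    head = f"{line.level} {line.tag}"
    if line.payload:
        head += f" {line.payload}"
    return head + "\n" + "".join(to_gedcom(sub) for sub in line.sub_lines)


indi = Node(0, "@I1@", "INDI", [
    Node(1, "NAME", "John /Doe/", [Node(2, "SURN", "Doe")]),
    Node(1, "SEX", "M"),
])
print(to_gedcom(indi), end="")
# 0 @I1@ INDI
# 1 NAME John /Doe/
# 2 SURN Doe
# 1 SEX M
```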

class fastgedcom.base.FakeLine[source]

Bases: Line

Dummy line for syntactic sugar.

It allows the chaining of method calls. See these examples for the usage of chaining.

The class behaves like a TrueLine (it has the same methods), but the payload is empty.

To differentiate a FakeLine from a TrueLine a simple boolean test is enough.
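
The idea behind chaining can be sketched with two small stand-in classes. `Fake` and `Real` below are hypothetical and only mimic the behaviour described on this page (the `>` and `>=` operators and the boolean test); they are not the library's classes.

```python
from __future__ import annotations

from dataclasses import dataclass, field


class Fake:
    """Hypothetical stand-in for FakeLine: empty, falsy, and chainable."""
    payload = ""

    def __bool__(self) -> bool:
        return False

    def __gt__(self, tag: str) -> Fake:
        return self  # looking inside nothing still yields nothing

    def __ge__(self, tag: str) -> str:
        return ""


FAKE = Fake()  # one shared instance, in the spirit of fake_line


@dataclass
class Real:
    """Hypothetical stand-in for TrueLine."""
    tag: str
    payload: str = ""
    sub_lines: list[Real] = field(default_factory=list)

    def __bool__(self) -> bool:
        return True

    def __gt__(self, tag: str) -> Real | Fake:
        # First sub-line with that tag, or the shared placeholder.
        return next((s for s in self.sub_lines if s.tag == tag), FAKE)

    def __ge__(self, tag: str) -> str:
        return (self > tag).payload


indi = Real("@I1@", "INDI", [Real("BIRT", "", [Real("DATE", "1 JAN 1900")])])
print((indi > "BIRT") >= "DATE")  # 1 JAN 1900
print(repr((indi > "DEAT") >= "DATE"))  # '' : no exception on the missing DEAT
print(bool(indi > "DEAT"))  # False
```

The parentheses matter: Python reads `a > b >= c` as a chained comparison, that is `(a > b) and (b >= c)`, so each step has to be wrapped explicitly.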

payload = ''[source]

See the description of TrueLine class.

payload_with_cont = ''[source]

Return the multi-line payload as a single string.

Multi-line payloads are split into several lines, as written in the original gedcom file. The continuation sub-lines carry the tags CONC and CONT. They are gathered into a single string by concatenating the payload of each line. A newline is inserted before the payload of each sub-line with the CONT tag.

sub_lines = [][source]

See the description of TrueLine class.

__bool__() Literal[False][source]

Return False.

get_sub_lines(tag: str) list[TrueLine][source]

Return all sub-lines having the given tag. Return an empty list if no line matches.

__rshift__(tag: str) list[TrueLine][source]

Alias for get_sub_lines() to shorten the syntax by using the >> operator.

get_sub_line(tag: str) TrueLine | FakeLine[source]

Return the first sub-line having the given tag. Return a FakeLine if no line matches.

__gt__(tag: str) TrueLine | FakeLine[source]

Alias for get_sub_line() to shorten the syntax by using the > operator.

get_sub_line_payload(tag: str) str[source]

Return the payload of the first sub-line having the given tag. Return an empty string if no line matches.

__ge__(tag: str) str[source]

Alias for get_sub_line_payload() to shorten the syntax by using the >= operator.

__repr__() str[source]

Return the string representation of the class.

__eq__(value: object) bool[source]

class fastgedcom.base.TrueLine[source]

Bases: Line

Represent a line of a gedcom document.

Contain the sub-lines of the gedcom structure to form a recursive representation of the gedcom file.

This class uses the simplified format, instead of the normalized Level [Xref] Tag [LineVal] format.

The format of a gedcom line: Level Tag Payload.

In the simplified format, the tag is either the normalized Tag or, when present, the Xref. Hence, the payload is the LineVal when the Xref is absent, or the normalized Tag followed by the LineVal (generally an empty string) when the Xref is present. The payload can be an empty string. The level matches the definition given in the gedcom standard.
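
A short sketch of the simplified format, assuming single-space separators. The `split_line` helper is hypothetical and is not the library's parser.

```python
def split_line(text: str) -> tuple[int, str, str]:
    """Hypothetical helper: split a gedcom line into (level, tag, payload).

    The second field becomes the tag, whether it is an Xref or a
    normalized Tag. Everything after it is the payload.
    """
    parts = text.split(" ", 2)
    level, tag = int(parts[0]), parts[1]
    payload = parts[2] if len(parts) > 2 else ""
    return level, tag, payload


print(split_line("0 @I1@ INDI"))        # (0, '@I1@', 'INDI')
print(split_line("1 NAME John /Doe/"))  # (1, 'NAME', 'John /Doe/')
print(split_line("1 BIRT"))             # (1, 'BIRT', '')
```

With this reading, a record line such as `0 @I1@ INDI` has the Xref as its tag and the record type as its payload.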

level: int[source]

The line level defined by the gedcom standard.

tag: str | XRef[source]

The cross-reference identifier for a level 0 line (also called the record identifier), or the tag defining the information and the structure of the data.

payload: str = ''[source]

The payload of the structure, also called content or value.

Warning: Multi-line payloads are split into several lines, as written in the original gedcom file. The continuation sub-lines carry the tags CONC and CONT. Use the payload_with_cont property to get the complete multi-line payload.

sub_lines: list[TrueLine][source]

List of the sub-lines, i.e. the next-level lines that are part of this structure.

__bool__() Literal[True][source]

Return True.

get_sub_lines(tag: str) list[TrueLine][source]

Return all sub-lines having the given tag. Return an empty list if no line matches.

__rshift__(tag: str) list[TrueLine][source]

Alias for get_sub_lines() to shorten the syntax by using the >> operator.

get_sub_line(tag: str) TrueLine | FakeLine[source]

Return the first sub-line having the given tag. Return a FakeLine if no line matches.

__gt__(tag: str) TrueLine | FakeLine[source]

Alias for get_sub_line() to shorten the syntax by using the > operator.

get_sub_line_payload(tag: str) str[source]

Return the payload of the first sub-line having the given tag. Return an empty string if no line matches.

__ge__(tag: str) str[source]

Alias for get_sub_line_payload() to shorten the syntax by using the >= operator.

__str__() str[source]

Return the gedcom representation of the line (sub-lines excluded).

__repr__() str[source]

Return the string representation of the class.

property payload_with_cont: str[source]

Return the multi-line payload as a single string.

Multi-line payloads are split into several lines, as written in the original gedcom file. The continuation sub-lines carry the tags CONC and CONT. They are gathered into a single string by concatenating the payload of each line. A newline is inserted before the payload of each sub-line with the CONT tag.

fastgedcom.base.Record: TypeAlias = TrueLine[source]

A level 0 line referenced by an XRef in the document.

class fastgedcom.base.Document[source]

Store all the information of the gedcom document.

All records (level 0 lines) are directly accessible via the records dictionary, and the other lines are accessible via TrueLine.sub_lines.

records: dict[XRef, Record][source]

Dictionary of records, accessible via get_records() or __getitem__(). Access it directly to raise a KeyError instead of getting a FakeLine. Useful when you are pretty sure that the Record exists in the document.
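
The difference between the tolerant lookup and direct dictionary access can be sketched as below. `Doc` is a hypothetical stand-in reduced to record lookup; records are plain strings here, and the empty string plays the role of the falsy FakeLine.

```python
class Doc:
    """Hypothetical stand-in for Document, reduced to record lookup."""

    def __init__(self, records: dict[str, str]) -> None:
        self.records = records

    def get_record(self, identifier: str) -> str:
        # Tolerant lookup: fall back to a falsy placeholder.
        return self.records.get(identifier, "")

    __getitem__ = get_record  # enables the doc[identifier] syntax


doc = Doc({"@I1@": "0 @I1@ INDI"})

print(repr(doc["@I2@"]))  # '' : falsy placeholder, no exception

try:
    doc.records["@I2@"]   # strict lookup on the dictionary itself
except KeyError as err:
    print("missing record:", err)  # missing record: '@I2@'
```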

__iter__() Iterator[Record][source]

Iterate on the lines of level 0: the records, the header, and the TRLR line.

__contains__(identifier: XRef) bool[source]

Return True if the identifier refers to an existing record.

get_records(record_type: str) Iterator[Record][source]

Return an iterator over the records of that record_type. The type is the payload of a level 0 line: INDI, FAM, etc.
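
Since the record type is the payload of the level 0 line, the selection amounts to a filter on that payload. A minimal sketch, assuming a hypothetical `Rec` tuple and `records_of_type` function rather than the library's own types:

```python
from typing import Iterator, NamedTuple


class Rec(NamedTuple):
    """Hypothetical stand-in for a record (level 0 line)."""
    tag: str      # the XRef, e.g. '@I1@'
    payload: str  # the record type, e.g. 'INDI'


def records_of_type(records: dict[str, Rec], record_type: str) -> Iterator[Rec]:
    """Lazily yield the records whose payload equals record_type."""
    return (rec for rec in records.values() if rec.payload == record_type)


records = {
    "@I1@": Rec("@I1@", "INDI"),
    "@F1@": Rec("@F1@", "FAM"),
    "@I2@": Rec("@I2@", "INDI"),
}
print([rec.tag for rec in records_of_type(records, "INDI")])  # ['@I1@', '@I2@']
```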

__rshift__[source]

Alias for get_records() to shorten the syntax by using the >> operator.

get_record(identifier: XRef | Literal['HEAD']) Record | FakeLine[source]

Return the record under that identifier.

__getitem__[source]

Alias for get_record() to shorten the syntax by using the [] operator.

all_lines() Iterator[list[TrueLine]][source]

Return an iterator over all lines of the document. Each element of the iterator is a list of lines: the path from a record down to one line, which is the last item of the list.

For example, given the following gedcom document:

0 @I1@ INDI
1 NAME John /Doe/
2 SURN Doe
0 @I2@ INDI

>>> list(document.all_lines())
[
    [<TrueLine 0 @I1@ INDI -> 1>],
    [<TrueLine 0 @I1@ INDI -> 1>, <TrueLine 1 NAME John /Doe/ -> 1>],
    [<TrueLine 0 @I1@ INDI -> 1>, <TrueLine 1 NAME John /Doe/ -> 1>, <TrueLine 2 SURN Doe -> 0>],
    [<TrueLine 0 @I2@ INDI -> 0>],
]

__eq__(__value: object) bool[source]

get_source() str[source]

Return the gedcom text equivalent of the Document as a string. Useful to save a modified Document into a file.

fastgedcom.base.fake_line[source]

FakeLine instance returned by functions. Used to avoid having multiple unnecessary instances of FakeLine.