mammos_entity.io

Support for reading and writing Entity files.

mammos_entity.io can write and read data in CSV and YAML format.

CSV

CSV files written by mammos_entity.io contain data in normal CSV format and additional commented metadata lines at the top of the file. Comment lines start with #; inline comments are not allowed.

The lines are, in order:

  • (Commented) the file version in the form mammos csv v<VERSION>. The reading code checks the version number (using regex v\d+) to ensure compatibility.

  • (Commented, optional) a description of the file if given. It will appear delimited by dashed lines. It is meant to be human readable and is ignored by reading routines in mammos_entity.io.

  • (Commented) the preferred ontology label.

  • (Commented) the ontology IRI.

  • (Commented) units.

  • The short labels used to refer to individual columns when working with the data, e.g. in a pandas.DataFrame. Omitting spaces in this string is advisable. Ideally this string is the short ontology label.

  • All remaining lines contain data.

Elements in a line are separated by a comma without any surrounding whitespace. A trailing comma is not permitted.

For columns without an ontology entry, the corresponding fields in the label and IRI lines are empty.

Similarly, columns without units (with or without an ontology entry) have an empty field in the units line.

Added in version v2: The optional description of the file.
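The header layout described above can also be read by hand, which may help when inspecting a file without mammos_entity installed. The following is a minimal sketch using only the standard library; it is not the parser used by mammos_entity.io. The names parse_mammos_csv, split_fields, DASHES and SAMPLE are hypothetical, and the sample text is hand-written following the rules above (three columns only).

```python
import csv
import re

# Hand-written sample following the rules above (assumption: not produced
# by mammos_entity itself).
SAMPLE = """\
#mammos csv v2
#----------------------------------------
# Test data
#----------------------------------------
#,SpontaneousMagnetization,
#,https://w3id.org/emmo/domain/magnetic_material#EMMO_032731f8-874d-5efb-9c9d-6dafaa17ef25,
#,kA / m,s2
index,Ms,alpha
0,100.0,1.2
1,100.0,3.4
"""

DASHES = re.compile(r"#-+")


def split_fields(line):
    """Split one comma-separated line into its fields."""
    return next(csv.reader([line]))


def parse_mammos_csv(text):
    """Hypothetical helper: split a mammos CSV text into metadata and rows."""
    lines = text.splitlines()

    # First line: commented version marker such as "#mammos csv v2".
    match = re.fullmatch(r"#\s*mammos csv (v\d+)", lines[0])
    if match is None:
        raise ValueError("missing 'mammos csv v<VERSION>' marker")
    pos = 1

    # Optional description, delimited by two dashed comment lines.
    description = None
    if DASHES.fullmatch(lines[pos]):
        end = pos + 1
        while not DASHES.fullmatch(lines[end]):
            end += 1
        description = "\n".join(line[1:].strip() for line in lines[pos + 1 : end])
        pos = end + 1

    # Three commented lines (labels, IRIs, units). Only the leading "#" is
    # dropped because IRIs themselves contain "#".
    labels, iris, units = (split_fields(lines[pos + k][1:]) for k in range(3))
    names = split_fields(lines[pos + 3])
    rows = [split_fields(line) for line in lines[pos + 4 :]]

    columns = {
        name: {"ontology_label": label, "ontology_iri": iri, "unit": unit}
        for name, label, iri, unit in zip(names, labels, iris, units)
    }
    return {
        "version": match.group(1),
        "description": description,
        "columns": columns,
        "rows": rows,
    }


parsed = parse_mammos_csv(SAMPLE)
print(parsed["version"], parsed["description"])
print(parsed["columns"]["Ms"])
```

Empty fields are kept as empty strings here. In the CSV format an empty units field does not distinguish "no unit" from "dimensionless", so the sketch does not try to interpret it.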

Example

Here is an example with five columns:

  • an index with no units or ontology label

  • the entity spontaneous magnetization with an entry in the ontology

  • a made-up quantity alpha with a unit but no ontology label

  • demagnetizing factor with an ontology entry but no unit

  • a column description containing a string description without units or ontology label

The file has a description reading “Test data”.

>>> from pathlib import Path
>>> import mammos_entity as me
>>> import mammos_units as u
>>> me.io.entities_to_file(
...     "example.csv",
...     "Test data",
...     index=[0, 1, 2],
...     Ms=me.Ms([1e2, 1e2, 1e2], "kA/m"),
...     alpha=[1.2, 3.4, 5.6] * u.s**2,
...     DemagnetizingFactor=me.Entity("DemagnetizingFactor", [1, 0.5, 0.5]),
...     description=[
...         "Description of the first data row",
...         "Description of the second data row",
...         "Description of the third data row",
...     ],
... )

The new file has the following content:

>>> print(Path("example.csv").read_text())
#mammos csv v2
#----------------------------------------
# Test data
#----------------------------------------
#,SpontaneousMagnetization,,DemagnetizingFactor,
#,https://w3id.org/emmo/domain/magnetic_material#EMMO_032731f8-874d-5efb-9c9d-6dafaa17ef25,,https://w3id.org/emmo/domain/magnetic_material#EMMO_0f2b5cc9-d00a-5030-8448-99ba6b7dfd1e,
#,kA / m,s2,,
index,Ms,alpha,DemagnetizingFactor,description
0,100.0,1.2,1.0,Description of the first data row
1,100.0,3.4,0.5,Description of the second data row
2,100.0,5.6,0.5,Description of the third data row

Finally, remove the file.

>>> Path("example.csv").unlink()

YAML

YAML files written by mammos_entity.io have the following format:

  • They have two top-level keys metadata and data.

  • metadata contains keys

    • version: a string that matches the regex v\d+

    • description: a (multi-line) string with arbitrary content

  • data contains one key per object saved in the file. Each object has the keys:

    • ontology_label: label in the ontology, null if the element is not an Entity.

    • ontology_iri: IRI of the entity, null if the element is not an Entity.

    • unit: unit of the entity or quantity, null if the element has no unit, empty string for dimensionless quantities and entities.

    • value: value of the data.
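The key layout listed above can be checked on an already parsed document. The following is a minimal sketch that works on a plain Python dictionary, such as a YAML loader would return, with null mapped to None. The function check_document and the constant ENTRY_KEYS are hypothetical and not part of mammos_entity. Requiring the version string to match v\d+ in full is an assumption, and the description key is not checked.

```python
import re

# Keys every entry under "data" is expected to have, as listed above.
ENTRY_KEYS = {"ontology_label", "ontology_iri", "unit", "value"}


def check_document(doc):
    """Return a list of layout problems; an empty list means none were found."""
    if set(doc) != {"metadata", "data"}:
        return ["top-level keys must be 'metadata' and 'data'"]

    problems = []
    version = doc["metadata"].get("version")
    # Assumption: the whole version string has to match v\d+.
    if not isinstance(version, str) or re.fullmatch(r"v\d+", version) is None:
        problems.append(f"invalid version: {version!r}")

    for name, entry in doc["data"].items():
        missing = ENTRY_KEYS - set(entry)
        if missing:
            problems.append(f"{name}: missing keys {sorted(missing)}")
    return problems


document = {
    "metadata": {"version": "v1", "description": "Test data"},
    "data": {
        # Plain list: no ontology entry and no unit.
        "index": {
            "ontology_label": None,
            "ontology_iri": None,
            "unit": None,
            "value": [0, 1, 2],
        },
        # Quantity: unit but no ontology entry.
        "alpha": {
            "ontology_label": None,
            "ontology_iri": None,
            "unit": "s2",
            "value": [1.2, 3.4, 5.6],
        },
    },
}

print(check_document(document))
```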

Example

Here is an example with six entries:

  • an index with no units or ontology label

  • the entity spontaneous magnetization with an entry in the ontology

  • a made-up quantity alpha with a unit but no ontology label

  • demagnetizing factor with an ontology entry but no unit

  • a column description containing a string description without units or ontology label

  • an element Tc with only a single value

The file has a description reading “Test data”.

>>> from pathlib import Path
>>> import mammos_entity as me
>>> import mammos_units as u
>>> me.io.entities_to_file(
...     "example.yaml",
...     "Test data",
...     index=[0, 1, 2],
...     Ms=me.Ms([1e2, 1e2, 1e2], "kA/m"),
...     alpha=[1.2, 3.4, 5.6] * u.s**2,
...     DemagnetizingFactor=me.Entity("DemagnetizingFactor", [1, 0.5, 0.5]),
...     description=[
...         "Description of the first data row",
...         "Description of the second data row",
...         "Description of the third data row",
...     ],
...     Tc=me.Tc(300, "K"),
... )

The new file has the following content:

>>> print(Path("example.yaml").read_text())
metadata:
  version: v1
  description: Test data
data:
  index:
    ontology_label: null
    ontology_iri: null
    unit: null
    value: [0, 1, 2]
  Ms:
    ontology_label: SpontaneousMagnetization
    ontology_iri: https://w3id.org/emmo/domain/magnetic_material#EMMO_032731f8-874d-5efb-9c9d-6dafaa17ef25
    unit: kA / m
    value: [100.0, 100.0, 100.0]
  alpha:
    ontology_label: null
    ontology_iri: null
    unit: s2
    value: [1.2, 3.4, 5.6]
  DemagnetizingFactor:
    ontology_label: DemagnetizingFactor
    ontology_iri: https://w3id.org/emmo/domain/magnetic_material#EMMO_0f2b5cc9-d00a-5030-8448-99ba6b7dfd1e
    unit: ''
    value: [1.0, 0.5, 0.5]
  description:
    ontology_label: null
    ontology_iri: null
    unit: null
    value: [Description of the first data row, Description of the second data row,
      Description of the third data row]
  Tc:
    ontology_label: CurieTemperature
    ontology_iri: https://w3id.org/emmo#EMMO_6b5af5a8_a2d8_4353_a1d6_54c9f778343d
    unit: K
    value: 300.0

Finally, remove the file.

>>> Path("example.yaml").unlink()

Functions

entities_from_csv(filename)

Deprecated: read CSV file with ontology metadata; use entities_from_file instead.

entities_from_file(filename)

Read files with ontology metadata.

entities_to_csv(_filename[, _description])

Deprecated: write tabular data to CSV file; use entities_to_file instead.

entities_to_file(_filename[, _description])

Write entity data to file.

Classes

EntityCollection(**kwargs)

Container class storing entity-like objects.

class mammos_entity.io.EntityCollection(**kwargs)

Container class storing entity-like objects.

Initialize EntityCollection; keyword arguments become attributes of the collection.

to_dataframe(include_units=True)

Convert values to dataframe.

Parameters:

include_units (bool)

mammos_entity.io.entities_from_file(filename)

Read files with ontology metadata.

Reads a file as defined in the module description. The returned container provides access to the individual entities.

Parameters:

filename (str | Path) – Name or path of file to read. The file extension is used to determine the file type.

Returns:

A container object providing access to all entities from the file.

Return type:

EntityCollection

mammos_entity.io.entities_to_file(_filename, _description=None, /, **entities)

Write entity data to file.

Supported file formats:

  • CSV

  • YAML

The file format is inferred from the filename suffix:

  • .csv is written as CSV

  • .yaml and .yml are written as YAML
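The suffix rule above could be sketched as follows. This is an illustration only: infer_format is a hypothetical helper, not a function of mammos_entity.io. Matching the suffix case-sensitively and raising ValueError for other suffixes are assumptions, since the behaviour for unsupported suffixes is not described here.

```python
from pathlib import Path

# Mapping taken from the list above.
SUFFIX_TO_FORMAT = {".csv": "csv", ".yaml": "yaml", ".yml": "yaml"}


def infer_format(filename):
    """Return 'csv' or 'yaml' depending on the suffix of filename."""
    suffix = Path(filename).suffix
    try:
        return SUFFIX_TO_FORMAT[suffix]
    except KeyError:
        # Assumption: unsupported suffixes are treated as an error.
        raise ValueError(f"unsupported file suffix: {suffix!r}") from None


print(infer_format("example.csv"))
print(infer_format(Path("results") / "example.yml"))
```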

The file structure is explained in the module-level documentation.

The arguments _filename and _description are named in such a way that a user can define entities named filename and description. They are furthermore defined as positional-only arguments.
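The effect of this signature can be seen with a small stand-in function. The sketch below does not write any file; write_stub is a hypothetical function that only mirrors the parameter list of entities_to_file and reports what it received.

```python
def write_stub(_filename, _description=None, /, **entities):
    """Hypothetical stand-in: return the arguments instead of writing a file."""
    return _filename, _description, sorted(entities)


# "filename" and "description" are free to be used as entity names.
result = write_stub(
    "example.csv",
    "Test data",
    filename=["a.txt", "b.txt"],
    description=["first row", "second row"],
)
print(result)

# Because of the "/", the first two parameters cannot be passed by keyword:
# _filename="example.csv" would be collected into **entities, and the
# required positional argument would then be missing.
try:
    write_stub(_filename="example.csv")
except TypeError as error:
    print("TypeError:", error)
```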

Parameters:
  • _filename (str | Path) – Name or path of file where to store data.

  • _description (str | None) – Optional description of data. If given, it will appear in the metadata part of the file.

  • **entities (mammos_entity.Entity | astropy.units.Quantity | numpy.typing.ArrayLike) – Data to be saved to file. For CSV, all entity-like objects need to have the same length and shape 0 or 1; YAML supports different lengths and arbitrary shape.

Return type:

None