Simulations

The LiteBIRD Simulation Framework is built on the Simulation class, which should be instantiated in any pipeline built using this framework. The class acts as a container for the many analysis modules available to the user, and it offers the following features:

  1. Provenance model;

  2. Interface with the instrument database;

  3. System abstractions;

  4. Generation of reports;

  5. Printing status messages on the terminal (logging).

Provenance model

A «provenance model» is, generally speaking, a way to track the history and origin of a data set by recording the following information:

  1. Who or what created the dataset?

  2. Which algorithm or instrumentation was used to produce it?

  3. Which steps were undertaken to process the raw data?

  4. How can one get access to the raw samples used to produce the dataset?

The LiteBIRD Simulation Framework tracks this information using parameter files (in TOML format) and by generating reports at the end of a simulation.

Parameter files

When you run a simulation, there are typically plenty of parameters that need to be passed to the code: the resolution of an output map, the names of the detectors to simulate, whether to include synchrotron emission in the sky model, etc.

The Simulation class eases this task by accepting the path to a TOML file as a parameter (parameter_file). Specifying this parameter triggers two actions:

  1. The file is copied to the output directory where the simulation output files are going to be written;

  2. The file is read and made available in the field parameters (a Python dictionary).

The parameter is optional; if you do not specify parameter_file when creating a Simulation object, the parameters field will be set to an empty dictionary. (You can even directly pass a dictionary to a Simulation object: this can be handy if you already constructed a parameter object somewhere else.)

Take this example of a simple TOML file:

# This is file "my_conf.toml"
[general]
nside = 512
imo_version = "v0.10"

[sky_model]
components = ["synchrotron", "dust", "cmb"]

The following example loads the TOML file and prints its contents to the terminal:

import litebird_sim as lbs

sim = lbs.Simulation(parameter_file="my_conf.toml")

print("NSIDE =", sim.parameters["general"]["nside"])
print("The IMO I'm going to use is",
      sim.parameters["general"]["imo_version"])

print("Here are the sky components I'm going to simulate:")
for component in sim.parameters["sky_model"]["components"]:
    print("-", component)

The output of the script is the following:

NSIDE = 512
The IMO I'm going to use is v0.10
Here are the sky components I'm going to simulate:
- synchrotron
- dust
- cmb
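
Because the parameters field is an ordinary nested dictionary, a module can read its own section and fall back to a default when a section or key is missing. The following is only a sketch: a plain dictionary mirroring my_conf.toml stands in for sim.parameters, and the helper get_param, the "output" section, and the default values are hypothetical names chosen for illustration, not part of the framework.

```python
# Stand-in for sim.parameters after loading "my_conf.toml" (assumption:
# a real script would use the dictionary exposed by the Simulation object).
parameters = {
    "general": {"nside": 512, "imo_version": "v0.10"},
    "sky_model": {"components": ["synchrotron", "dust", "cmb"]},
}


def get_param(params, section, key, default=None):
    """Hypothetical helper: return params[section][key], or `default`
    if either the section or the key is missing."""
    return params.get(section, {}).get(key, default)


nside = get_param(parameters, "general", "nside", default=256)
# The "output" section is absent from the file, so the fallback is used
fmt = get_param(parameters, "output", "format", default="fits")

print("NSIDE =", nside)
print("Output format =", fmt)
```

Reading values through a helper like this keeps a module usable even when the parameter file omits its section.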

A Simulation object only interprets the section simulation and leaves everything else unevaluated: it’s up to the simulation modules to make sense of any other section. The recognized parameters in the section named simulation are the following:

  • base_path: a string containing the path where to save the results of the simulation.

  • start_time: the start time of the simulation. If it is a string or a TOML datetime, it will be passed to the constructor for astropy.time.Time, otherwise it must be a floating-point value.

  • duration_s: a floating-point number specifying how many seconds the simulation should last. You can pass a string, which can contain a measurement unit as well: in this case, you are not forced to specify the duration in seconds. Valid units are: days (or day), hours (or hour), min, and sec (or s).

  • name: a string containing the name of the simulation.

  • description: a string containing a (possibly long) description of what the simulation does.

These parameters can be used instead of the keywords in the constructor of the Simulation class. Consider the following code:

import astropy.time
from litebird_sim import Simulation

sim = Simulation(
    base_path="/storage/output",
    start_time=astropy.time.Time("2020-02-01T10:30:00"),
    duration_s=3600.0,
    name="My simulation",
    description="A long description should be put here",
)

You can achieve the same if you create a TOML file named foo.toml and containing the following lines:

[simulation]
base_path = "/storage/output"
start_time = 2020-02-01T10:30:00
duration_s = 3600.0
name = "My simulation"
description = "A long description should be put here"

and then you initialize the sim variable in your Python code as follows:

sim = Simulation(parameter_file="foo.toml")

You would achieve identical results if you specified the duration in any of the following ways:

# All of these are the same
duration_s = "1 hour"
duration_s = "60 min"
duration_s = "3600 s"
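
To make the equivalence concrete, here is a minimal sketch of how a duration with a measurement unit could be converted to seconds. It is not the framework's own parser: the function parse_duration_s and the table _SECONDS_PER_UNIT are hypothetical, and only the units listed above are recognized.

```python
# Conversion factors to seconds for the units listed in the text
_SECONDS_PER_UNIT = {
    "day": 86400.0, "days": 86400.0,
    "hour": 3600.0, "hours": 3600.0,
    "min": 60.0,
    "sec": 1.0, "s": 1.0,
}


def parse_duration_s(value):
    """Hypothetical helper: convert a number or a string like "60 min"
    into a floating-point number of seconds."""
    if isinstance(value, (int, float)):
        return float(value)  # Plain numbers are already in seconds

    parts = value.split()
    if len(parts) == 1:
        return float(parts[0])  # A bare number passed as a string
    if len(parts) != 2:
        raise ValueError(f"invalid duration: {value!r}")

    number, unit = parts
    if unit not in _SECONDS_PER_UNIT:
        raise ValueError(f"unknown unit {unit!r} in duration {value!r}")
    return float(number) * _SECONDS_PER_UNIT[unit]


for text in ("1 hour", "60 min", "3600 s", 3600.0):
    print(repr(text), "->", parse_duration_s(text))
```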

Interface with the instrument database

To simulate LiteBIRD’s data acquisition, the simulation code must be aware of the characteristics of the instrument. These are specified in the LiteBIRD Instrument Model (IMO) database, which can be accessed by people with sufficient rights. The Simulation Framework can access the database and retrieve the input parameters that its analysis modules need to produce the expected output.

System abstractions

In some cases, simulations must be run on HPC computers, distributing the job over many processing units; in other cases, a simple laptop might be enough. The LiteBIRD Simulation Framework uses MPI to parallelize its code; MPI is, however, an optional dependency, and the code can also be run serially.

When creating a Simulation object, the user can tell the framework whether or not to use MPI through the flag use_mpi:

import litebird_sim as lbs

# This simulation must be run using MPI
sim = lbs.Simulation(use_mpi = True)

The framework sets a number of variables related to MPI; these variables are always defined, even if MPI is not available, and they can be used to make the code work in different situations. If your code must be able to run both with and without MPI, you should initialize a Simulation object using the variable MPI_ENABLED:

import litebird_sim as lbs

# This simulation can take advantage of MPI, if present
sim = lbs.Simulation(use_mpi = lbs.MPI_ENABLED)
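
The idea of MPI-related names that are always defined can be illustrated with a common pattern, sketched here under the assumption that mpi4py is the MPI binding in use. The class SerialCommunicator and the variable mpi_enabled are illustrative names, not the framework's own; the point is only that the code after the try block does not need to know which branch was taken.

```python
class SerialCommunicator:
    """Hypothetical stand-in used when MPI is unavailable: it behaves
    like a communicator with a single process."""

    rank = 0
    size = 1


try:
    # mpi4py is optional; this import fails if it is not installed
    from mpi4py import MPI

    comm = MPI.COMM_WORLD
    mpi_enabled = True
except ImportError:
    comm = SerialCommunicator()
    mpi_enabled = False

# The same code path works in both cases
if comm.rank == 0:
    print(f"MPI enabled: {mpi_enabled}, number of processes: {comm.size}")
```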

See the page Using MPI for more information.

Generation of reports

This section should explain how reports can be generated, first from the perspective of a library user, and then describing how developers can generate plots for their own modules.

Here is an example, showing several advanced topics like mathematical formulae, plots, and value substitution:

import litebird_sim as lbs
import matplotlib.pyplot as plt

sim = lbs.Simulation(name="My simulation", base_path="output")
data_points = [0, 1, 2, 3]

plt.plot(data_points)
fig = plt.gcf()

sim.append_to_report(r'''
Here is a formula for $`f(x)`$:

```math
f(x) = \sin x
```

And here is a completely unrelated plot:

![](myplot.png)

The data points have the following values:
{% for sample in data_points %}
- {{ sample }}
{% endfor %}
''', figures=[(fig, "myplot.png")],
     data_points=data_points)

sim.flush()

And here is the output, which is saved in output/report.html:

[Figure: screenshot of the generated report (report_example.png)]

Logging

The report generation tools described above are useful to produce a concise summary of the scientific outcomes of a simulation. However, one often wants to monitor the execution of the code in more detail, checking which functions have been called, how often, etc. In this case, the best option is to write messages to the terminal. Python provides the logging module for this purpose, and when you initialize a Simulation object, the module is initialized with a set of sensible defaults. In your code you can use the functions debug, info, warning, error, and critical to monitor what’s happening during execution:

import litebird_sim as lbs
import logging as log       # "log" is shorter to write
my_sim = lbs.Simulation()
log.info("the simulation starts here!")
pi = 3.15
if pi != 3.14:
    log.error("wrong value of pi!")

The output of the code above is the following:

[2020-07-18 06:25:27,653 INFO] the simulation starts here!
[2020-07-18 06:25:27,653 ERROR] wrong value of pi!

Note that the messages are prepended with the date, time, and level of severity of the message.

A few environment variables can tailor the way logging is done:

  • LOG_DEBUG: by default, debug messages are not printed to the terminal, because they are often too verbose for typical uses. If you want to debug your code, set this variable to a non-empty value.

  • LOG_ALL_MPI: by default, if you are using MPI, only messages from the process running with rank 0 are printed. Setting this environment variable makes all the processes print their messages to the terminal. (Caution: messages might overlap if two processes happen to write at the same time.)

The way you use these variables from the terminal is best illustrated with an example. Suppose that we changed our example above, so that log.debug is called instead of log.info and log.error:

import litebird_sim as lbs
import logging as log  # "log" is shorter to write

my_sim = lbs.Simulation()
log.debug("the simulation starts here!")
pi = 3.15
if pi != 3.14:
    log.debug("wrong value of pi!")

In this case, running the script will produce no messages, as the default is to skip log.debug calls:

$ poetry run python my_script.py
$

However, running the script with the environment variable LOG_DEBUG set will make the messages appear:

$ LOG_DEBUG=1 poetry run python my_script.py  # Debug messages enabled
[2020-07-18 06:31:03,223 DEBUG] the simulation starts here!
[2020-07-18 06:31:03,224 DEBUG] wrong value of pi!
$
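
The behaviour described in this section can be approximated with the standard library alone. The sketch below is an assumption about how such defaults might be set up, not the framework's actual code: the helper configure_logging is hypothetical, the format string is chosen to reproduce the layout of the messages shown above, and the rank argument stands for the MPI rank of the current process.

```python
import logging
import os


def configure_logging(environ=os.environ, rank=0):
    """Hypothetical helper: set up the root logger in the style shown
    in the examples above and return the logging level that was chosen."""
    # A non-empty LOG_DEBUG enables debug messages
    level = logging.DEBUG if environ.get("LOG_DEBUG") else logging.INFO

    # Unless LOG_ALL_MPI is set, silence every process except rank 0
    if rank != 0 and not environ.get("LOG_ALL_MPI"):
        level = logging.CRITICAL + 1

    logging.basicConfig(
        level=level,
        format="[%(asctime)s %(levelname)s] %(message)s",
        force=True,  # Replace any handler installed earlier (Python 3.8+)
    )
    return level


configure_logging()
logging.info("the simulation starts here!")
logging.debug("printed only if LOG_DEBUG is set")
```

Passing the environment as an argument makes the choice of level easy to check without touching the real environment variables.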

API reference

class litebird_sim.simulations.OutputFileRecord(path, description)

Bases: tuple

property description

Alias for field number 1

property path

Alias for field number 0

class litebird_sim.simulations.Simulation(base_path=None, name=None, mpi_comm=<litebird_sim.mpi._SerialMpiCommunicator object>, description='', start_time=None, duration_s=None, imo=None, parameter_file=None, parameters=None)

Bases: object

A container object for running simulations

This is the most important class in the Litebird_sim framework. It initializes an output directory that will contain all the products of a simulation and will handle the generation of reports and writing of output files.

Be sure to call Simulation.flush() when the simulation is completed. This ensures that all the information is saved to disk before your script terminates.

You can access the fields base_path, name, mpi_comm, and description in the Simulation object:

import litebird_sim

sim = litebird_sim.Simulation(name="My simulation")
print(f"Running {sim.name}, saving results in {sim.base_path}")

The member variable observations is a list of Observation objects, which is initialized by the methods create_observations() (when distribute=True) and distribute_workload().

This class keeps track of any output file saved in base_path through the member variable self.list_of_outputs. This is a list of objects of type OutputFileRecord(), which are 2-tuples of the form (path, description), where path is a pathlib.Path object and description is a str object:

for curpath, curdescr in sim.list_of_outputs:
    print(f"{curpath}: {curdescr}")
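
Since OutputFileRecord derives from tuple and exposes the fields path (position 0) and description (position 1), it behaves like a named tuple. The sketch below uses a stand-in built with collections.namedtuple rather than the real class, and the file names and descriptions are made up for illustration.

```python
from collections import namedtuple
from pathlib import Path

# Stand-in with the same two fields as OutputFileRecord (assumption:
# the real class is provided by litebird_sim.simulations)
OutputFileRecord = namedtuple("OutputFileRecord", ["path", "description"])

list_of_outputs = [
    OutputFileRecord(Path("output") / "zero_map.fits.gz", "Map of zeroes"),
    OutputFileRecord(path=Path("output") / "report.html", description="Report"),
]

for record in list_of_outputs:
    # Fields can be read by name or by position
    assert record.path == record[0]
    assert record.description == record[1]
    print(f"{record.path.name}: {record.description}")
```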

When pointing information is needed, you can call the method Simulation.generate_spin2ecl_quaternions(), which initializes the members pointing_freq_hz and spin2ecliptic_quats; these members are used by functions like get_pointings().

Parameters
  • base_path (str or pathlib.Path) – the folder that will contain the output. If this folder does not exist and the user has sufficient rights, it will be created.

  • name (str) – a string identifying the simulation. This will be used in the reports.

  • mpi_comm – either None (do not use MPI) or an MPI communicator object, like mpi4py.MPI.COMM_WORLD.

  • description (str) – a (possibly long) description of the simulation, to be put in the report saved in base_path.

  • start_time (float or astropy.time.Time) – the start time of the simulation. It can be either an arbitrary floating-point number (e.g., 0) or an astropy.time.Time instance; in the latter case, this triggers a more precise (and slower) computation of pointing information.

  • duration_s (float) – Number of seconds the simulation should last.

  • imo (Imo) – an instance of the Imo class

  • parameter_file (str or pathlib.Path) – path to a TOML file that contains the parameters for the simulation. This file will be copied into base_path, and its contents will be read into the field parameters (a Python dictionary).

append_to_report(markdown_text: str, append_newline=True, figures: List[Tuple[Any, str]] = [], **kwargs)

Append text and figures to the simulation report

Parameters
  • markdown_text (str) – text to be appended to the report.

  • append_newline (bool) – append newlines to the end of the text. This ensures that calling this method again will produce a separate paragraph.

  • figures (list of 2-tuples) – list of Matplotlib figures to be saved in the report. Each tuple must contain one figure and one filename. The figures will be saved using the specified file name in the output directory. The file name must match the one used as reference in the Markdown text.

  • kwargs – any other keyword argument will be used to expand the text markdown_text using the Jinja2 library.

A Simulation object can generate reports in Markdown format. Use this function to add some text to the report, possibly including figures. The function has no effect if called from an MPI rank other than 0.

It is possible to use objects other than Matplotlib figures. The only method this function calls is savefig, with no arguments.

Images are saved immediately during the call, but the text will be written to disk only when flush() is called.

You can put LaTeX formulae in the text, using $`...`$ for inline equations and the math tag in fenced text for displayed equations.

create_observations(detectors: List[litebird_sim.detectors.DetectorInfo], num_of_obs_per_detector: int = 1, distribute=True, n_blocks_det=1, n_blocks_time=1, root=0, dtype_tod=<class 'numpy.float32'>)

Create a set of Observation objects

distribute_workload(observations: List[litebird_sim.observations.Observation])

flush(include_git_diff=True)

Terminate a simulation.

This function must be called when a simulation is complete. It will save pending data to the output directory.

It returns a Path object pointing to the HTML file that has been saved in the directory pointed to by self.base_path.

generate_spin2ecl_quaternions(scanning_strategy: Union[None, litebird_sim.scanning.ScanningStrategy] = None, imo_url: Union[None, str] = None, delta_time_s: float = 60.0)

Simulate the motion of the spacecraft in free space

This method computes the quaternions that encode the evolution of the spacecraft’s orientation in time, assuming the scanning strategy described in the parameter scanning_strategy (an object of a class derived from ScanningStrategy; most likely, you want to use SpinningScanningStrategy).

You can choose to use the imo_url parameter instead of scanning_strategy: in this case, it will be assumed that you want to simulate a nominal, spinning scanning strategy, and the object in the IMO with address imo_url (e.g., /releases/v1.0/satellite/scanning_parameters/) describing the parameters of the scanning strategy will be loaded. In this case, a SpinningScanningStrategy object will be created automatically.

The parameter delta_time_s specifies how often quaternions should be computed; see ScanningStrategy.generate_spin2ecl_quaternions() for more information.

write_healpix_map(filename: str, pixels, **kwargs) → str

Save a Healpix map in the output folder

Parameters
  • filename (str or pathlib.Path) – Name of the file. It must be a relative path, but it can include subdirectories.

  • pixels – array containing the pixels, or list of arrays if you want to save several maps into the same FITS table (e.g., I, Q, U components)

Returns

A pathlib.Path object containing the full path of the FITS file that has been saved.

Example:

import numpy as np
from litebird_sim import Simulation

sim = Simulation(base_path="/storage/litebird/mysim")
pixels = np.zeros(12)  # 12 pixels: the smallest Healpix map (NSIDE = 1)
sim.write_healpix_map("zero_map.fits.gz", pixels)

This method saves a Healpix map into a FITS file that is written into the output folder for the simulation.
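
The array of 12 pixels in the example corresponds to the coarsest Healpix resolution: a Healpix map with resolution parameter NSIDE contains 12 × NSIDE² pixels. The helper npix_from_nside below is a hypothetical name used only to show how the size of the pixels array relates to a value such as the nside parameter stored in a parameter file.

```python
def npix_from_nside(nside):
    """Hypothetical helper: number of pixels in a Healpix map with the
    given NSIDE, following the relation NPIX = 12 * NSIDE**2."""
    if nside < 1:
        raise ValueError(f"NSIDE must be a positive integer, got {nside}")
    return 12 * nside * nside


print(npix_from_nside(1))    # 12, the size used in the example above
print(npix_from_nside(512))  # 3145728
```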

litebird_sim.simulations.get_template_file_path(filename: Union[str, pathlib.Path]) → pathlib.Path

Return a Path object pointing to the full path of a template file.

The framework uses template files to produce automatic reports. These files usually reside in the templates subfolder of the main repository.

Given a filename (e.g., report_header.md), this function returns a full, absolute path to the file within the templates folder of the litebird_sim source code.
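
As a rough illustration of this kind of lookup, the sketch below joins a base folder, a templates subfolder, and the file name into an absolute path. It is not the function's actual implementation: template_file_path and its package_dir argument are hypothetical, and the real function determines the location of the source code by itself.

```python
from pathlib import Path


def template_file_path(filename, package_dir):
    """Hypothetical sketch: absolute path of `filename` inside the
    "templates" folder located under `package_dir`."""
    return Path(package_dir).resolve() / "templates" / filename


# "." stands for the folder that holds the source code (an assumption
# made only to keep the example self-contained)
path = template_file_path("report_header.md", package_dir=".")
print(path)
```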