API Reference

Randomly generate populations. <https://github.com/alexanderharms/simago>

Population

Functions around the PopulationClass object.

class simago.population.PopulationClass(popsize, random_seed=None)

Bases: object

Class for the population.

Parameters
  • popsize (int) – Size of population.

  • random_seed (int) – Seed for random number generation. Defaults to None.

random_seed

Seed for random number generation.

Type

int

popsize

Size of the population.

Type

int

prob_objects

List of ProbabilityClass objects.

Type

list

population

DataFrame containing the generated population.

Type

Pandas DataFrame

add_property(ProbClass)

Adds a ProbabilityClass object to the PopulationClass object.

Parameters

ProbClass (ProbabilityClass object) – ProbabilityClass object for the property.

export(output, nowrite=False)

Exports the generated population from PopulationClass.population. The population can either be printed to screen or written to a CSV file.

Parameters
  • output (string) – Path and filename for the CSV file.

  • nowrite (boolean) – If True, the population will only be printed to the command line and not written to file. Defaults to False.

get_conditional_population(property_name, cond_index)

Gets the population corresponding to the conditions supplied by the condition index for a certain property.

Parameters
  • property_name (string) – Name of property to be considered.

  • cond_index (int) – Index of one of the conditions defined for the property.

Returns

population_cond – DataFrame of the population that satisfies the condition.

Return type

DataFrame

remove_property(property_name)

Removes a ProbabilityClass object from the PopulationClass object.

Parameters

property_name (string) – Name of property to be removed.

update(property_name='all')

Updates properties for the population by drawing new values.

Parameters

property_name (string) – Name of property to be updated. Defaults to ‘all’, which updates all of the properties defined in the PopulationClass instance.

simago.population.construct_query_string(property_name, option, relation)

Constructs the query string passed to the Pandas DataFrame.query method for the relations defined in the conditions file. For more information on the conditions file, see the ‘File Properties’ section of the documentation.

Parameters
  • property_name (str) –

  • option (int) –

  • relation (str) –

Returns

query_string – String to be used in the Pandas .query function.

Return type

str
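
To illustrate the kind of string this function produces, the sketch below maps a relation code to a comparison operator and formats the result for DataFrame.query. The helper name build_query_string and the relation codes ('eq', 'leq', and so on) are assumptions made for this example; the codes simago actually accepts are described with the conditions file.

```python
# Hypothetical relation codes; simago's real vocabulary may differ.
RELATION_OPERATORS = {
    "eq": "==",
    "neq": "!=",
    "leq": "<=",
    "geq": ">=",
    "le": "<",
    "gr": ">",
}


def build_query_string(property_name, option, relation):
    """Return a string such as 'age <= 3' for use with DataFrame.query."""
    if relation not in RELATION_OPERATORS:
        raise ValueError(f"Unknown relation: {relation!r}")
    return f"{property_name} {RELATION_OPERATORS[relation]} {option}"


query_string = build_query_string("age", 3, "leq")
print(query_string)  # age <= 3
```

A string of this form could then be passed to the .query method of the population DataFrame to select the matching rows.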

simago.population.generate_population(popsize, yaml_folder, rand_seed=None)

Generate population.

Parameters
  • popsize (int) – Size of population.

  • yaml_folder (string) – Folder with settings YAML files.

  • rand_seed (int) – Seed for random number generation. Defaults to None.

Returns

The generated population.

Return type

PopulationClass object

Probability

Classes and functions surrounding the ProbabilityClass objects.

class simago.probability.ContinuousProbabilityClass(yaml_object)

Bases: simago.probability.ProbabilityClass

ContinuousProbabilityClass contains attributes and methods surrounding the properties with continuous probability distributions.

draw_values(pop_obj)

Draw values for continuous variables.

Parameters

pop_obj (PopulationClass) –

Returns

population – DataFrame containing (new) column for the property with newly drawn values.

Return type

DataFrame

class simago.probability.DiscreteProbabilityClass(yaml_object)

Bases: simago.probability.ProbabilityClass

DiscreteProbabilityClass contains attributes and methods surrounding the properties with discrete probability distributions.

draw_values(pop_obj)

Draw values for discrete, i.e. categorical and ordinal, variables.

Parameters

pop_obj (PopulationClass) –

Returns

population – DataFrame containing (new) column for the property with newly drawn values.

Return type

DataFrame

generate_probabilities()

Convert the data to a discrete probability distribution.

read_data(data_file)

Read in data for discrete probability distributions.

Parameters

data_file (string) – Filename for the CSV file.

class simago.probability.ProbabilityClass(yaml_object)

Bases: abc.ABC

Abstract base class; inherited versions of this class contain attributes and methods surrounding the properties of the population and their stochastic behaviour.

Parameters

yaml_object (dict) – Dictionary containing the checked information from a settings file.

property_name

Unique name of property.

Type

string

data_type

Type

string

conditions

Type

DataFrame

read_conditions(conditions_file)

Reads the conditions from a CSV file and checks them.

Parameters

conditions_file (string) – Filename for the CSV file.

simago.probability.check_comb_conditions(probab_objects)

Checks the ProbabilityClass objects for impossible situations, e.g. properties that are dependent on non-defined properties.

Parameters

probab_objects (list of ProbabilityClass objects) – List of objects to be checked.
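
One of the impossible situations mentioned above, a property that depends on an undefined property, can be detected roughly as follows. This is a minimal sketch and not simago's implementation: a plain dictionary that maps each property name to the names it is conditioned on stands in for the list of ProbabilityClass objects, and the function name is hypothetical.

```python
def find_undefined_dependencies(dependencies):
    """Map each property to the dependencies that are not defined.

    `dependencies` maps a property name to the names it is conditioned on
    (a hypothetical stand-in for a list of ProbabilityClass objects).
    """
    defined = set(dependencies)
    problems = {}
    for name, needs in dependencies.items():
        missing = sorted(set(needs) - defined)
        if missing:
            problems[name] = missing
    return problems


dependencies = {"age": [], "income": ["age", "education"]}
print(find_undefined_dependencies(dependencies))  # {'income': ['education']}
```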

simago.probability.draw_from_cont_distribution(pdf, parameters, size, random_seed)

Draw from a continuous distribution.

Parameters
  • pdf (function) – Function that returns an rv_continuous object.

  • parameters (list) – List of parameters for the probability distribution function.

  • size (int) – Number of values drawn from distribution.

  • random_seed (int) – Seed for random number generation.

Returns

drawn_values – List of values drawn from the probability distribution function.

Return type

list
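
The idea of seeded draws from a parametrised continuous distribution can be sketched with the standard library alone. Note the substitution: Python's random module stands in here for the SciPy rv_continuous object named in the parameter list, and draw_continuous and normal are hypothetical names, not part of simago.

```python
import random


def draw_continuous(sampler, parameters, size, random_seed=None):
    """Draw `size` values by calling `sampler(rng, *parameters)` repeatedly."""
    # A private generator keeps the seed local to this call.
    rng = random.Random(random_seed)
    return [sampler(rng, *parameters) for _ in range(size)]


def normal(rng, mean, stddev):
    """Sample one value from a normal distribution."""
    return rng.gauss(mean, stddev)


drawn_values = draw_continuous(normal, [40.0, 12.5], size=5, random_seed=42)
print(drawn_values)
```

Passing the same seed again yields the same list, which is what makes a generated population reproducible.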

simago.probability.draw_from_disc_distribution(probabs, size, random_seed)

Draw from a discrete distribution.

Parameters
  • probabs (Pandas DataFrame) – DataFrame containing the discrete probability distribution.

  • size (int) – Number of values drawn from distribution.

  • random_seed (int) – Seed for random number generation.

Returns

drawn_values – List of drawn values.

Return type

list
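
A weighted draw from a discrete distribution can be sketched as shown below. For simplicity the example takes two plain lists (options and their probabilities) instead of the Pandas DataFrame described above, and it uses random.choices from the standard library; the function name draw_discrete is hypothetical.

```python
import random


def draw_discrete(options, probabilities, size, random_seed=None):
    """Draw `size` options, weighted by `probabilities`."""
    rng = random.Random(random_seed)
    return rng.choices(options, weights=probabilities, k=size)


drawn_values = draw_discrete(
    ["low", "middle", "high"], [0.2, 0.5, 0.3], size=10, random_seed=1
)
print(drawn_values)
```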

simago.probability.order_probab_objects(probab_objects)

Orders ProbabilityClass objects so all properties that do not depend on others are handled first.

Parameters

probab_objects (list of ProbabilityClass objects) – List of objects to be ordered.

Returns

probab_objects – Ordered list of objects.

Return type

list of ProbabilityClass objects
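
Ordering properties so that independent ones come first amounts to a topological sort. The sketch below shows one simple way to do this, assuming the dependencies are available as a dictionary from property name to the names it depends on; that representation and the function name are assumptions for illustration, not simago's internals.

```python
def order_by_dependencies(dependencies):
    """Return property names so that each one follows its dependencies."""
    ordered = []
    remaining = dict(dependencies)
    while remaining:
        # Properties whose dependencies have all been placed already.
        ready = [
            name
            for name, needs in remaining.items()
            if all(dep in ordered for dep in needs)
        ]
        if not ready:
            raise ValueError("Circular or undefined dependency detected")
        for name in ready:
            ordered.append(name)
            del remaining[name]
    return ordered


dependencies = {"income": ["age", "sex"], "age": ["sex"], "sex": []}
print(order_by_dependencies(dependencies))  # ['sex', 'age', 'income']
```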

Yamlutils

Functions for finding, checking and loading of the YAML configuration files.

simago.yamlutils.adjust_filenames(yaml_object)

Adjust filenames of the data files.

Parameters

yaml_object (dict) – Dictionary with the information from the YAML file.

Returns

yaml_object – Dictionary with the changed filename paths.

Return type

dict

simago.yamlutils.check_yaml(yaml_object)

Check YAML object.

Using a number of assertions, each YAML object (the dictionary derived from the YAML file) is checked for completeness and for variables defined in the correct format.

Parameters

yaml_object (dict) – Dictionary with the information from the YAML file.

Returns

yaml_object – Dictionary with the checked information from the YAML file.

Return type

dict
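
An assertion-based check of this kind might look like the sketch below. The required keys and the accepted data_type values are assumptions chosen for this example; the keys that simago actually requires are defined by its settings file format.

```python
def check_settings(yaml_object):
    """Assert that a settings dictionary is complete and correctly typed."""
    # Required keys are an assumption made for this example.
    for key in ("property_name", "data_type"):
        assert key in yaml_object, f"Missing key: {key}"
    assert isinstance(yaml_object["property_name"], str), \
        "property_name must be a string"
    # Accepted values are likewise assumed, not taken from simago.
    assert yaml_object["data_type"] in ("categorical", "ordinal", "continuous"), \
        "Unsupported data_type"
    return yaml_object


settings = check_settings({"property_name": "age", "data_type": "ordinal"})
print(settings["property_name"])  # age
```

Keep in mind that assert statements are skipped when Python runs with the -O flag, so checks written this way are only active in normal mode.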

simago.yamlutils.find_yamls(yaml_folder)

Find YAML files from a folder.

Gather all the YAML files for the aggregated data in the folder. Return the list of paths to YAML files.

Parameters

yaml_folder (str) – Name of folder where YAML files are stored.

Returns

yaml_filenames – List of YAML filenames.

Return type

list of str
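
Collecting the YAML files in a folder can be sketched with the glob module. The function name and the assumption that settings files end in .yml or .yaml are illustrative only; the source does not state which extensions simago looks for.

```python
import glob
import os


def find_yaml_files(yaml_folder):
    """Return a sorted list of paths to .yml and .yaml files in a folder."""
    filenames = []
    for pattern in ("*.yml", "*.yaml"):
        filenames.extend(glob.glob(os.path.join(yaml_folder, pattern)))
    return sorted(filenames)


yaml_filenames = find_yaml_files(".")
print(yaml_filenames)
```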

simago.yamlutils.load_yamls(yaml_filenames)

Load YAML files.

Loads the YAML configuration files and converts them to dictionaries using the yaml package. Checks whether the imported YAML files contain the correct information using the function check_yaml.

Parameters

yaml_filenames (list of str) – List of YAML filenames.

Returns

yaml_objects – List of YAML objects.

Return type

list of dict