Gobierto Budgets Data

Gobierto Budgets Data is a gem containing a series of shared methods and classes used to manipulate and import budget lines.

Classes used to import budget lines

GobiertoBudgetsData::GobiertoBudgets::BudgetLinesCsvImporter

This is the main class used to import data from a CSV with the format documented here

The class must be initialized with a CSV and exposes an import! method, which processes the CSV in several steps:

  • Initializes an instance of GobiertoBudgetsData::GobiertoBudgets::BudgetLineCsvRow for each row and stores the instances in a @rows array.
  • Detects the rows whose code is at its last level (no other row has this row's code as its parent code) and, for each one, accumulates the values of the initial_value, modified_value and executed_value columns, grouped by year, organization_id, area_name and kind:
    • If the row belongs to the economic, functional or custom area, the values are accumulated for each parent code of the main code column.
    • If the row belongs to the special economic-custom or economic-functional areas, the values are accumulated for the main code at its first level and for each parent code of the custom or functional code. If the area is economic-custom, the hierarchy of parent codes and levels present in the CSV is taken into account.
  • Once all the accumulated values are calculated, a new row is created for each distinct combination of year, organization_id, area_name, kind, code and, where applicable, custom_code or functional_code. The new rows are compared with the existing ones and replace their values when duplicates are found.
  • Once the duplicates are resolved, the initial rows and the new ones are joined and converted to the format required to upload them to Elasticsearch. This format is provided by the data method of the GobiertoBudgetsData::GobiertoBudgets::BudgetLineCsvRow class, which generates one entry for each index.
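
The accumulation step for the economic and functional areas can be pictured with the following minimal sketch. It is not the gem's implementation: the Row struct, the parent_codes helper and the accumulate_parent_values method are hypothetical names, and it assumes that a parent code is simply a shorter prefix of the code (for example 100 has parents 10 and 1).

```ruby
# Hypothetical stand-in for BudgetLineCsvRow, holding only the columns used here.
Row = Struct.new(:year, :organization_id, :area_name, :kind, :code,
                 :initial_value, :modified_value, :executed_value,
                 keyword_init: true)

VALUE_COLUMNS = %i[initial_value modified_value executed_value].freeze

# Assumption: parent codes are the shorter prefixes of a code ("100" -> "1", "10").
def parent_codes(code)
  (1...code.length).map { |length| code[0, length] }
end

# Sums the values of last-level rows into every parent code, grouped by
# year, organization_id, area_name and kind.
def accumulate_parent_values(rows)
  totals = Hash.new { |hash, key| hash[key] = Hash.new(0.0) }

  rows.group_by { |r| [r.year, r.organization_id, r.area_name, r.kind] }
      .each do |group, group_rows|
    codes = group_rows.map(&:code)
    # A last-level row is one whose code is not a prefix of any other code in the group.
    leaves = group_rows.reject do |row|
      codes.any? { |code| code != row.code && code.start_with?(row.code) }
    end

    leaves.each do |leaf|
      parent_codes(leaf.code).each do |parent|
        # Empty (nil) values count as zero.
        VALUE_COLUMNS.each { |col| totals[group + [parent]][col] += leaf[col].to_f }
      end
    end
  end

  totals
end
```

The sketch only produces the accumulated totals. Building the new rows from them and replacing duplicates, as described in the last two steps, is left out.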

Custom area

In addition to the main budget line areas, economic and functional, there is an optional third area named custom, which accepts an arbitrary classification and hierarchy of codes. To import custom area data correctly, the codes and their hierarchy must be provided in the CSV using the code, parent_code and level columns. The importer looks for the custom codes at their last level and aggregates their values up the hierarchy of parent codes. If there are more than two levels, the higher-level codes, along with their parent codes and level numbers, should be included in the CSV (their values can be left empty, since they are calculated later in the aggregations). Here is an example of the format of budget lines of the custom area
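
As an illustration of why the parent_code column is needed, the following sketch aggregates a single value column up an explicit hierarchy. It is an assumption-laden simplification, not the importer's code: the rows are plain hashes keyed by column name, the aggregate_custom method is hypothetical, and the hierarchy is assumed to contain no cycles.

```ruby
# Aggregates initial_value from last-level custom codes into all their ancestors,
# following the parent_code column instead of code prefixes.
def aggregate_custom(rows)
  # Map each code to its parent; an empty parent_code means a top-level code.
  parent_of = rows.to_h do |row|
    parent = row[:parent_code].to_s.empty? ? nil : row[:parent_code]
    [row[:code], parent]
  end

  # Last-level codes are those that never appear as a parent.
  parents = parent_of.values.compact
  leaves = rows.reject { |row| parents.include?(row[:code]) }

  totals = Hash.new(0.0)
  leaves.each do |leaf|
    code = parent_of[leaf[:code]]
    while code
      totals[code] += leaf[:initial_value].to_f
      code = parent_of[code]
    end
  end
  totals
end

# Hypothetical three-level hierarchy; higher levels are present with empty values.
rows = [
  { code: "A",     parent_code: nil,  level: 1, initial_value: nil },
  { code: "A1",    parent_code: "A",  level: 2, initial_value: nil },
  { code: "A1-01", parent_code: "A1", level: 3, initial_value: 10 },
  { code: "A1-02", parent_code: "A1", level: 3, initial_value: 5 }
]
custom_totals = aggregate_custom(rows)
```

If the row for A1 were missing from the input, nothing would link A1 to A and the top level would receive no total, which is why the intermediate levels should be included in the CSV.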

Economic decomposition of functional and custom budget lines

The amounts of functional and custom budget lines of the expense kind can be decomposed by their economic distribution.

For example, https://madrid.gobierto.es/presupuestos/partidas/13/2020/functional/G shows the functional code 13 "Seguridad y movilidad ciudadana" with its distribution across the functional subcodes of the next level (in "In this budget line") and also its economic decomposition for the first level of economic codes (in "Expense budget lines distribution").

To import this information, use the economic-functional or economic-custom value in the area column of the CSV, put the economic code in the code column and put the functional or custom code in the functional_code or custom_code column, depending on the area. The importer will store the line and calculate the accumulated value for the highest level of the economic code, both for the functional or custom code itself and for all its parent codes. All these entries will be saved in Elasticsearch with the economic area in the _type attribute, but they will include a functional_code or custom_code depending on the decomposition and will use a special format for _id:

  • The normal id format is the concatenation of values separated with /: organization_id/year/code/kind
  • For a decomposition of economic-functional the concatenation differs: organization_id/year/functional_code/code/f/kind
  • For a decomposition of economic-custom the concatenation differs: organization_id/year/custom_code/code/c/kind
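
The three id formats above can be summarized in a small helper. This is only a sketch: the budget_line_id method and its keyword arguments are hypothetical names chosen for illustration, and the organization id used in the usage lines is an example value.

```ruby
# Builds the _id of a budget line following the three formats listed above.
# Passing functional_code or custom_code selects the decomposition format.
def budget_line_id(organization_id:, year:, code:, kind:,
                   functional_code: nil, custom_code: nil)
  parts =
    if functional_code
      [organization_id, year, functional_code, code, "f", kind]
    elsif custom_code
      [organization_id, year, custom_code, code, "c", kind]
    else
      [organization_id, year, code, kind]
    end
  parts.join("/")
end

budget_line_id(organization_id: "28079", year: 2020, code: "1", kind: "G")
# => "28079/2020/1/G"
budget_line_id(organization_id: "28079", year: 2020, code: "1", kind: "G", functional_code: "13")
# => "28079/2020/13/1/f/G"
```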

GobiertoBudgetsData::GobiertoBudgets::CustomCategoriesCsvImporter

Use this class to import names of custom area codes. The class instances must be initialized with a CSV with the format documented here and a hash of settings:

  • site: Mandatory. The site the categories belong to.
  • locale: Optional. The locale of the category names. If no locale is provided, the default locale of the site is used.

To create the categories in the Gobierto database, call the import! method.

GobiertoBudgetsData::GobiertoBudgets::BudgetLinesSicalwinImporter

This class implements an importer for the Sicalwin file format, provided as a CSV file. The class must be initialized with a path to the file, the organization id and the year. These last two attributes are necessary because the Sicalwin file contains no information about the date or the entity of the data.

To support the data loading there are two auxiliary classes:

  • BudgetLineSicalwinRow, which wraps around each row of the CSV
  • SicalwinBudgetLineProcessor, which uses the sicalwin row instances to create the hierarchy of budget lines that can be obtained from each row.

Given a row like this:

",,,",Alias,Org.,Pro.,Eco.,Descripción,Créditos Iniciales,Modificaciones de Crédito,Remanentes Incorporados,Créditos Totales consignados,Obligaciones Reconocidas
,,00000,91200,10000,Retribuciones básicas de Altos Cargos - ORGANOS DE GOBIERNO,"1,080,498.67",0.00,0.00,"1,080,498.67","758,621.73"

These budget lines are imported:

  • economic budget lines 100, 10 and 1
  • functional budget lines 912, 91 and 9
  • custom budget line 00000-91200-10000
  • economic-functional budget lines: 912-100, 912-10, 912-1, 91-100, 91-10, 91-1, 9-100, 9-10, 9-1
  • economic-custom budget lines: 00000-91200-10000-100, 00000-91200-10000-10, 00000-91200-10000-1
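
The way one row expands into this hierarchy can be sketched as follows. This is not the code of SicalwinBudgetLineProcessor: the method names are hypothetical, and the rule that economic and functional codes are cut to their first three digits is an assumption inferred from the example row (Pro. 91200 gives 912, Eco. 10000 gives 100).

```ruby
# Returns the code truncated to max_level digits, followed by its parent codes.
# Assumption: only the first three digits of the Sicalwin code are significant.
def code_levels(raw_code, max_level: 3)
  max_level.downto(1).map { |length| raw_code[0, length] }
end

# Expands the Org., Pro. and Eco. columns of one row into budget line codes per area.
def sicalwin_budget_lines(org:, pro:, eco:)
  economic = code_levels(eco)
  functional = code_levels(pro)
  custom = [org, pro, eco].join("-")

  {
    economic: economic,
    functional: functional,
    custom: [custom],
    # Every functional level is combined with every economic level.
    economic_functional: functional.product(economic).map { |pair| pair.join("-") },
    economic_custom: economic.map { |economic_code| "#{custom}-#{economic_code}" }
  }
end

lines = sicalwin_budget_lines(org: "00000", pro: "91200", eco: "10000")
lines[:economic]   # => ["100", "10", "1"]
lines[:functional] # => ["912", "91", "9"]
```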