# Gobierto Data / Examples
## Filters for resources with custom fields
Resources with custom fields defined accept a `filter` parameter in the index action. This applies both to `api/v1/data/datasets` and `gobierto_investments/api/v1/projects`.
The structure of a filter parameter is:

```
filter[CUSTOM_FIELD_UID][ [OPERATOR] ]=VALUE[ ,VALUE ... ][ &filter[CUSTOM_FIELD_UID][ [OPERATOR] ]=VALUE[ & ... ] ]
```

The spaced square brackets denote optional parts: the operator, additional comma-separated values, and additional `&`-joined filters.
- When no operator is included, the `eq` operator is assumed.
- Depending on the type of the custom field, the value must be:
  - For custom fields of type vocabulary, the value is expected to be a term id.
  - For custom fields of type date, the value must have `YYYY-MM-DD` format.
- The available operators are:
  - `eq`: Custom field value must equal the value.
  - `in`: The values section can be a list of values separated by commas. The custom field value must be in the values list.
  - `gt`: Custom field value must be greater than the value.
  - `gteq`: Custom field value must be greater than or equal to the value.
  - `lteq`: Custom field value must be less than or equal to the value.
  - `lt`: Custom field value must be less than the value.
  - `like`: Custom field value is searched using the SQL `ILIKE` condition (case-insensitive).
### Filter Examples
- Datasets with a category (vocabulary type) associated to the term with id 1 and a cost of 750000:

  ```
  GET /api/v1/data/datasets?filter[category]=1&filter[cost]=750000
  ```

- Datasets with a category associated to the term with id 1:

  ```
  GET /api/v1/data/datasets?filter[category][eq]=1
  ```

- Datasets with a start date between 2019-01-01 (included) and 2020-01-01 (not included):

  ```
  GET /api/v1/data/datasets?filter[start-date][gteq]=2019-01-01&filter[start-date][lt]=2020-01-01
  ```

- Datasets with a cost of either 100000 or 250000:

  ```
  GET /api/v1/data/datasets?filter[cost][in]=100000,250000
  ```

- Datasets with a description including the term culture:

  ```
  GET /api/v1/data/datasets?filter[description][like]=%culture%
  ```
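A complete request combining two of the operators above might look like the sketch below. The `--globoff` flag stops curl from treating the square brackets in the filter syntax as URL globs; the host `https://···` and the Bearer token are placeholders as in the dataset examples later in this guide, and whether the `Authorization` header is required is an assumption that depends on how the datasets are exposed.

```
# Sketch: list datasets with category term 1 and cost greater than 100000.
# --globoff keeps curl from interpreting the [] filter syntax as URL globs.
# Host and token are placeholders; the auth header may be optional for
# publicly visible datasets (an assumption).
curl --globoff --location --request GET \
  'https://···/api/v1/data/datasets?filter[category]=1&filter[cost][gt]=100000' \
  --header 'Authorization: Bearer XXXXXXXXXX'
```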
## Create datasets
There are three ways to create or update a dataset loading data from a CSV, depending on how the CSV is obtained:

- The CSV is available remotely at a URL
- The CSV is available in a filesystem the API has access to
- The CSV is on the client side and must be uploaded with the API request

The first two cases use an `application/json` body, as many of the other API requests do. The schema of the JSON to create a dataset is:
```
{
  "data": {
    "type": "gobierto_data-dataset_forms",
    "attributes": {
      "name": ···,
      "table_name": ···,
      "slug": ···,
      "data_path": ···,
      "local_data": ···,
      "csv_separator": ···,
      "schema": ···
    }
  }
}
```
The attributes are described in this guide.

Examples:

Suppose the admin has the token `XXXXXXXXXX`.
### From URL
For example, we want to create a dataset loading data from https://gishubdata.nd.gov/sites/default/files/NDHUB.Roads_MileMarkers_1.csv. This CSV uses a comma as separator, so we don't need to provide a `csv_separator` option (the default is `","`):
```
curl --location --request POST 'https://···/api/v1/data/datasets' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer XXXXXXXXXX' \
--data-raw '{
  "data": {
    "type": "gobierto_data-dataset_forms",
    "attributes": {
      "name": "Example 1",
      "table_name": "example_1_records",
      "data_path": "https://gishubdata.nd.gov/sites/default/files/NDHUB.Roads_MileMarkers_1.csv",
      "local_data": false
    }
  }
}'
```
### From URL, using `schema`, `csv_separator` and defining a slug
The previous request doesn't use the `schema` or `csv_separator` options. Once we have inspected the content of the CSV, we have a guess of the data types and decide to include a schema for the columns with types other than text:
```
curl --location --request POST 'https://···/api/v1/data/datasets' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer XXXXXXXXXX' \
--data-raw '{
  "data": {
    "type": "gobierto_data-dataset_forms",
    "attributes": {
      "name": "Example 2",
      "table_name": "example_2_records",
      "slug": "example-2-slug",
      "data_path": "https://gishubdata.nd.gov/sites/default/files/NDHUB.Roads_MileMarkers_1.csv",
      "local_data": false,
      "csv_separator": ",",
      "schema": {
        "objectid": {
          "original_name": "OBJECTID",
          "type": "integer"
        },
        "hwy": {
          "original_name": "HWY",
          "type": "integer"
        },
        "created_date": {
          "original_name": "CREATED_DATE",
          "type": "date",
          "optional_params": {
            "date_format": "YYYYMMDDHH24MISS"
          }
        },
        "last_edited_date": {
          "original_name": "LAST_EDITED_DATE",
          "type": "date",
          "optional_params": {
            "date_format": "YYYYMMDDHH24MISS"
          }
        },
        "route_id_rims": {
          "original_name": "ROUTE_ID_RIMS",
          "type": "integer"
        },
        "fromdate": {
          "original_name": "FROMDATE",
          "type": "date",
          "optional_params": {
            "date_format": "YYYYMMDD"
          }
        },
        "todate": {
          "original_name": "TODATE",
          "type": "date",
          "optional_params": {
            "date_format": "YYYYMMDD"
          }
        },
        "measure": {
          "original_name": "MEASURE",
          "type": "numeric"
        }
      }
    }
  }
}'
```
### From a local file
Suppose that the API is installed on a server which has a CSV in its filesystem at the path `/home/ubuntu/2008_10k.csv`. We can load the CSV into a new dataset by changing the attribute `local_data` to `true`:
```
curl --location --request POST 'https://···/api/v1/data/datasets' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer XXXXXXXXXX' \
--data-raw '{
  "data": {
    "type": "gobierto_data-dataset_forms",
    "attributes": {
      "name": "Example 3",
      "table_name": "example_3_records",
      "data_path": "/home/ubuntu/2008_10k.csv",
      "local_data": true
    }
  }
}'
```
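Once created, the dataset should be retrievable from its show endpoint (the update section below notes that updates use the same endpoint as show). A minimal sketch, assuming the dataset ended up with the slug `example-3`, as the next section does:

```
# Sketch: fetch the dataset created above (assumes its slug is example-3).
curl --location --request GET 'https://···/api/v1/data/datasets/example-3' \
--header 'Authorization: Bearer XXXXXXXXXX'
```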
## Update datasets
### From a local file, replacing all content
Loading new data and regenerating the table is the default behaviour of the update action. The endpoint is the same as the one used for show. If our dataset has the slug `example-3`:
```
curl --location --request PUT 'https://···/api/v1/data/datasets/example-3' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer XXXXXXXXXX' \
--data-raw '{
  "data": {
    "type": "gobierto_data-dataset_forms",
    "attributes": {
      "data_path": "/home/ubuntu/2008_10k.csv",
      "local_data": true
    }
  }
}'
```
### From a local file, appending content
This appends all content from the file to the table. You have to set the attribute `append` to `true` (`false` by default). This operation will raise an error if the schema of the CSV is not compatible with the previously existing data.
```
curl --location --request PUT 'https://···/api/v1/data/datasets/example-3' \
--header 'Content-Type: application/json' \
--header 'Authorization: Bearer XXXXXXXXXX' \
--data-raw '{
  "data": {
    "type": "gobierto_data-dataset_forms",
    "attributes": {
      "data_path": "/home/ubuntu/2008_100k.csv",
      "local_data": true,
      "append": true
    }
  }
}'
```
## Using multipart/form-data
### Create a dataset uploading a CSV using multipart/form-data
The multipart/form-data format allows us to upload CSV and schema files directly. Suppose that there is a local CSV at the path `/local/path/file.csv`. In this kind of request the same options as in the previous examples can be sent under `dataset[attribute_name]`, with the exception of `data_path` and `local_data`, which don't make sense in this context. The file must be sent with `dataset[data_file]`:
```
curl --location --request POST 'https://···/api/v1/data/datasets' \
--header 'Authorization: Bearer XXXXXXXXXX' \
--header 'Content-Type: multipart/form-data' \
--form 'dataset[data_file]=@/local/path/file.csv' \
--form 'dataset[name]=Example 4' \
--form 'dataset[table_name]=example_4_records' \
--form 'dataset[slug]=example-4-slug' \
--form 'dataset[csv_separator]=,'
```
### Update a dataset uploading a CSV and a schema file using multipart/form-data
Suppose that we have the schema in `/local/path/schema.json` and the data in `/local/path/file.csv`, and we want to append the data to an existing dataset:
```
curl --location --request PUT 'https://mataro.gobierto.test/api/v1/data/datasets/example-5-superslug' \
--header 'Authorization: Bearer XXXXXXXXXX' \
--header 'Content-Type: multipart/form-data' \
--form 'dataset[data_file]=@/local/path/file.csv' \
--form 'dataset[schema_file]=@/local/path/schema.json' \
--form 'dataset[append]=true' \
--form 'dataset[csv_separator]=,'
```
The schema can also be passed as a string containing JSON with `dataset[schema]`. If both `dataset[schema_file]` and `dataset[schema]` are sent, the latter is ignored and the schema is taken from the uploaded file.
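For instance, a minimal sketch passing the schema inline as a form field; it reuses the `objectid` column definition from the schema example earlier in this guide, and the host, token and slug are placeholders taken from the previous examples:

```
# Sketch: schema sent inline as a JSON string via dataset[schema]
# instead of uploading a schema file. Host, token and slug are
# placeholders reused from the earlier examples.
curl --location --request PUT 'https://···/api/v1/data/datasets/example-4-slug' \
--header 'Authorization: Bearer XXXXXXXXXX' \
--header 'Content-Type: multipart/form-data' \
--form 'dataset[data_file]=@/local/path/file.csv' \
--form 'dataset[schema]={"objectid": {"original_name": "OBJECTID", "type": "integer"}}'
```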