- Introduction
- Database Structure
- Data Flow for Query Construction
- Setup Instructions
- Usage Examples
- Important Notes
- Contributing
- License
The Average Price Data (AP) from the Bureau of Labor Statistics (BLS) provides detailed information on average consumer prices for household fuels, motor fuels, and food items. Collected monthly across various urban areas in the United States, this data is crucial for measuring the price levels of specific items over time and across different regions.
This repository contains scripts and a database schema to set up and manage a local SQLite database for storing and querying the AP data. It includes tools for downloading the latest data from the BLS website and fetching Consumer Price Index (CPI) data via the BLS API.
The database comprises several tables that store data about items, areas, periods, series, and the actual price observations. Understanding the schema and relationships between these tables is crucial for constructing accurate SQL queries and extracting meaningful insights.
-
ap_item
- Purpose: Stores information about the items for which average prices are recorded.
- Fields:
item_code
(TEXT, PRIMARY KEY): Unique identifier for each item.item_name
(TEXT): Descriptive name of the item.
- Example Entries:
701111
: Flour, white, all purpose, per lb. (453.6 gm)702111
: Sugar, white, all sizes, per lb. (453.6 gm)
-
ap_area
- Purpose: Contains information about the geographic areas covered in the survey.
- Fields:
area_code
(TEXT, PRIMARY KEY): Unique identifier for each area.area_name
(TEXT): Descriptive name of the area.
- Example Entries:
0000
: U.S. city averageA100
: Northeast UrbanS200
: South Urban
-
ap_period
- Purpose: Defines the periods (months) for which data is collected.
- Fields:
period
(TEXT, PRIMARY KEY): Code representing the period (e.g., M01 for January).period_abbr
(TEXT): Abbreviation of the period name (e.g., JAN).period_name
(TEXT): Full name of the period (e.g., January).
- Example Entries:
M01
: JAN, JanuaryM02
: FEB, February
-
ap_series
- Purpose: Provides metadata about each time series, linking items and areas.
- Fields:
series_id
(TEXT, PRIMARY KEY): Unique identifier for each time series.area_code
(TEXT): Referencesap_area.area_code
.item_code
(TEXT): Referencesap_item.item_code
.series_title
(TEXT): Title describing the series.footnote_codes
(TEXT): Any associated footnotes.begin_year
(INTEGER): First year of data availability.begin_period
(TEXT): First period of data availability.end_year
(INTEGER): Last year of data availability.end_period
(TEXT): Last period of data availability.
- Relationships:
ap_series.area_code
→ap_area.area_code
ap_series.item_code
→ap_item.item_code
-
ap_data_current
- Purpose: Holds current year-to-date average price data.
- Fields:
series_id
(TEXT): Referencesap_series.series_id
.year
(INTEGER): Year of the observation.period
(TEXT): Referencesap_period.period
.value
(REAL): Observed average price.footnote_codes
(TEXT): Any associated footnotes.
- Primary Key:
(series_id, year, period)
- Relationships:
ap_data_current.series_id
→ap_series.series_id
ap_data_current.period
→ap_period.period
-
ap_data_food
- Purpose: Contains average price data for food items.
- Fields and Relationships: Same as
ap_data_current
.
-
ap_data_gasoline
- Purpose: Contains average price data for gasoline.
- Fields and Relationships: Same as
ap_data_current
.
-
ap_data_householdfuels
- Purpose: Contains average price data for household fuels.
- Fields and Relationships: Same as
ap_data_current
.
-
ap_seasonal
- Purpose: Stores information about seasonal adjustment codes.
- Fields:
seasonal_code
(TEXT, PRIMARY KEY): Code indicating seasonal adjustment.seasonal_text
(TEXT): Description of the seasonal code.
Below is the SQL schema used to create the tables:
CREATE TABLE ap_item (
item_code TEXT PRIMARY KEY,
item_name TEXT
);
CREATE TABLE ap_area (
area_code TEXT PRIMARY KEY,
area_name TEXT
);
CREATE TABLE ap_period (
period TEXT PRIMARY KEY,
period_abbr TEXT,
period_name TEXT
);
CREATE TABLE ap_seasonal (
seasonal_code TEXT PRIMARY KEY,
seasonal_text TEXT
);
CREATE TABLE ap_series (
series_id TEXT PRIMARY KEY,
area_code TEXT,
item_code TEXT,
series_title TEXT,
footnote_codes TEXT,
begin_year INTEGER,
begin_period TEXT,
end_year INTEGER,
end_period TEXT
);
CREATE TABLE ap_data_current (
series_id TEXT,
year INTEGER,
period TEXT,
value REAL,
footnote_codes TEXT,
PRIMARY KEY(series_id, year, period)
);
CREATE TABLE ap_data_food (
series_id TEXT,
year INTEGER,
period TEXT,
value REAL,
footnote_codes TEXT,
PRIMARY KEY(series_id, year, period)
);
CREATE TABLE ap_data_gasoline (
series_id TEXT,
year INTEGER,
period TEXT,
value REAL,
footnote_codes TEXT,
PRIMARY KEY(series_id, year, period)
);
CREATE TABLE ap_data_householdfuels (
series_id TEXT,
year INTEGER,
period TEXT,
value REAL,
footnote_codes TEXT,
PRIMARY KEY(series_id, year, period)
);
CREATE TABLE cpi_info (
series_id TEXT,
year INTEGER,
period TEXT,
value REAL,
footnote_codes TEXT,
PRIMARY KEY(series_id, year, period)
);
To construct a query that retrieves specific average price data, follow these steps:
-
Identify the Item:
- Use
ap_item
to find theitem_code
corresponding to the desireditem_name
.
- Use
-
Identify the Area:
- Use
ap_area
to find thearea_code
corresponding to the desiredarea_name
.
- Use
-
Find the Series ID:
- Use
ap_series
to find theseries_id
matching both theitem_code
andarea_code
.
- Use
-
Retrieve Data Observations:
- Use the
series_id
to query the appropriateap_data_*
table (ap_data_food
,ap_data_gasoline
, etc.) for the desiredyear
andperiod
.
- Use the
-
Join Period Information:
- Use
ap_period
to translateperiod
codes into readableperiod_name
values.
- Use
- Python 3.6+
- SQLite3
- pip (Python package installer)
- Virtual Environment (recommended)
# Clone the repository
git clone https://github.com/yourusername/ap-database.git
cd ap-database
# Create a virtual environment (optional but recommended)
python -m venv venv
source venv/bin/activate # On Windows, use venv\Scripts\activate
# Install required Python packages
pip install -r requirements.txt
Run the seed_data.py
script to initialize the database:
python seed_data.py
This script will:
- Create the SQLite database named
average_price_data.db
. - Create all the tables as per the schema.
- Load data from local CSV files into the database.
Use the get_http.py
script to download the necessary data files from the BLS website:
python get_http.py
This script will:
- Download specified files from the BLS FTP site.
- Save them in the
downloads
directory.
Note: Ensure that the downloads
directory exists or will be created by the script.
Use the get_api.py
script to fetch Consumer Price Index (CPI) data via the BLS API:
-
Obtain a BLS API Key:
-
Register at the BLS website to obtain an API key.
-
Store the API key in a
.env
file in the project root:BLS_API_KEY=your_api_key_here
-
-
Run the Script:
python get_api.py
This script will:
- Fetch CPI data for specified
series_id
,start_year
, andend_year
. - Save the data into text files and insert it into the
cpi_info
table in the database.
- Fetch CPI data for specified
To retrieve specific average price data, you can use the following SQL query structure:
SELECT
d.year,
p.period_name,
i.item_name,
a.area_name,
d.value
FROM
ap_data_food AS d
JOIN
ap_series AS s ON d.series_id = s.series_id
JOIN
ap_item AS i ON s.item_code = i.item_code
JOIN
ap_area AS a ON s.area_code = a.area_code
JOIN
ap_period AS p ON d.period = p.period
WHERE
i.item_name = 'Sugar, white, all sizes, per lb. (453.6 gm)'
AND a.area_name = 'U.S. city average'
ORDER BY
d.year, p.period_name;
This query will:
- Retrieve the average price of sugar per pound in U.S. city averages.
- Display the data ordered by year and month.
-
Primary Keys:
- Ensure uniqueness and efficient data retrieval.
-
Foreign Keys:
- Maintain referential integrity between tables.
-
Data Partitioning:
- Data is divided into specific tables based on item categories for optimized access.
-
Understanding Period Codes:
- Monthly Periods:
M01
toM12
represent January to December.
- Annual Averages:
M13
may be used to represent annual average data.
- Monthly Periods:
Contributions are welcome! Please open an issue or submit a pull request for any improvements or bug fixes.
This project is licensed under the MIT License.
Leave a Reply