
Chatbot LLMs Gatotkaca.AI #Part 2.2


Goal

After retrieving data from the API, we’ll restructure it for seamless storage in our PostgreSQL database.

Actions

Let’s dive into our .json files and transform them into a data masterpiece! We’ll use pandas to loop through the files, convert each one into a structured dataframe, and fill any missing values with -1. Think of pandas as your personal data magician, making sure everything is in tip-top shape before we store it in the database.

import json
import pandas as pd

wmo_code_to_conditions = {
    -1: 'Nothing to report',
    0: 'Clear sky',
    1: 'Mainly clear, partly cloudy, and overcast',
    2: 'Mainly clear, partly cloudy, and overcast',
    3: 'Mainly clear, partly cloudy, and overcast',
    45: 'Fog and depositing rime fog',
    48: 'Fog and depositing rime fog',
    51: 'Drizzle: Light, moderate, and dense intensity',
    53: 'Drizzle: Light, moderate, and dense intensity',
    55: 'Drizzle: Light, moderate, and dense intensity',
    56: 'Freezing Drizzle: Light and dense intensity',
    57: 'Freezing Drizzle: Light and dense intensity',
    61: 'Rain: Slight, moderate and heavy intensity',
    63: 'Rain: Slight, moderate and heavy intensity',
    65: 'Rain: Slight, moderate and heavy intensity',
    66: 'Freezing Rain: Light and heavy intensity',
    67: 'Freezing Rain: Light and heavy intensity',
    71: 'Snow fall: Slight, moderate, and heavy intensity',
    73: 'Snow fall: Slight, moderate, and heavy intensity',
    75: 'Snow fall: Slight, moderate, and heavy intensity',
    77: 'Snow grains',
    80: 'Rain showers: Slight, moderate, and violent',
    81: 'Rain showers: Slight, moderate, and violent',
    82: 'Rain showers: Slight, moderate, and violent',
    85: 'Snow showers slight and heavy',
    86: 'Snow showers slight and heavy',
    95: 'Thunderstorm: Slight or moderate',
    96: 'Thunderstorm with slight and heavy hail',
    99: 'Thunderstorm with slight and heavy hail'
}

# indProv: the list of province records retrieved earlier; each entry
# carries the province capital under the 'ibukota' key
for index in range(len(indProv)):
  capital = indProv[index]['ibukota']

  with open(f'temp/json_d/{index+1}_{capital}.json', 'r') as json_file:
    data_json = json.load(json_file)

  df_daily = pd.DataFrame(data_json['daily'])
  nan_count_per_column = df_daily.isna().sum()  # quick check on missing values

  df_init = df_daily.copy()
  df_init.fillna(-1, inplace=True)
  df_init['city'] = capital
  df_init['name_weather_code'] = df_init['weather_code'].map(wmo_code_to_conditions)

  # Write the DataFrame to a CSV file
  df_init.to_csv(f'temp/csv/{index+1}_{capital}.csv', index=False)
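As a quick sanity check of the fillna-then-map step above, here is a minimal, self-contained sketch on a hypothetical two-row slice of a daily payload (the column names mirror the loop above, but the values are made up):

```python
import pandas as pd

# Trimmed-down version of the WMO mapping, just the codes used in this demo
wmo_code_to_conditions = {0: 'Clear sky',
                          61: 'Rain: Slight, moderate and heavy intensity'}

# Hypothetical two-row 'daily' payload with one missing temperature
df = pd.DataFrame({
    'time': ['2024-01-01', '2024-01-02'],
    'weather_code': [0, 61],
    'temperature_2m_max': [31.2, None],  # assumed column name for illustration
})

df.fillna(-1, inplace=True)  # -1 marks "nothing to report"
df['name_weather_code'] = df['weather_code'].map(wmo_code_to_conditions)

print(df['name_weather_code'].tolist())
print(df['temperature_2m_max'].tolist())
```

The missing temperature comes out as -1.0 (the column is float), and each numeric code gets its readable label.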

There are two tools you need to install before saving our data into the database: psycopg2 and SQLAlchemy. psycopg2 is a Python library for interacting with PostgreSQL databases. It provides a simple interface for executing SQL queries, fetching results, and managing database connections. SQLAlchemy is another popular Python library for working with databases, including PostgreSQL. It offers a more object-oriented approach, providing features like declarative mapping and ORM (Object-Relational Mapping).

import pandas as pd
from sqlalchemy import create_engine  # psycopg2 must be installed as the PostgreSQL driver


# Database connection parameters
db_params = {
    'dbname': '_',
    'user': '_',
    'password': '_',
    'host': 'localhost',
    'port': '5432'
}

# Connect to the PostgreSQL database
# Create the connection string
connection_string = f"postgresql+psycopg2://{db_params['user']}:{db_params['password']}@{db_params['host']}:{db_params['port']}/{db_params['dbname']}"

# Create the SQLAlchemy engine
engine = create_engine(connection_string)

# Define the table name
table_name = 'daily_weather_prov_indonesia'

# Append the dataframe to the table (run once per province dataframe)
df_init.to_sql(table_name, engine, if_exists='append', index=False)

# Close the connection
engine.dispose()

It’s time to bridge the gap between our dataframe and our database. We’re converting the dataframe’s structure into SQL syntax.
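To see what that conversion looks like, pandas can print the CREATE TABLE statement it would generate for a dataframe via `pandas.io.sql.get_schema`. A minimal sketch, using a hypothetical one-row frame that mirrors the columns we built above:

```python
import pandas as pd
from pandas.io.sql import get_schema

# Hypothetical one-row frame with the same columns as df_init above
df_demo = pd.DataFrame({
    'time': ['2024-01-01'],
    'weather_code': [61],
    'city': ['Jakarta'],
    'name_weather_code': ['Rain: Slight, moderate and heavy intensity'],
})

# Show the SQL DDL pandas would emit for this dataframe's structure
print(get_schema(df_demo, 'daily_weather_prov_indonesia'))
```

Each pandas dtype is translated to a generic SQL type (object columns become TEXT, integer columns become INTEGER); when a live engine is passed via the `con` argument, the types are tailored to the target database instead.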