Chatbot LLMs Gatotkaca.AI #Part 2.1
Goal

At this stage, our goal is to build a knowledge base that will serve as the foundation for our LLMs. In the context of Large Language Models (LLMs), a knowledge base is a structured repository of information that the model can access or be trained on, so that it can generate accurate, relevant, and contextually appropriate responses. In short, the knowledge base is the central source of information our LLMs draw on when interacting with the user.
Step
1. Pull data from the API using requests
2. Normalize the data using numpy and pandas
3. Load it into our databases
Actions
The API that will supply the information for our knowledge base is Open-Meteo. Open-Meteo is an open-source weather API that is free for non-commercial use. To fetch weather data from it, we need to specify the longitude and latitude of the location.
Longitude and latitude are coordinates used to specify locations on Earth’s surface.
- Latitude measures how far north or south a point is from the Equator, ranging from 0° at the Equator to 90° at the poles.
- Longitude measures how far east or west a point is from the Prime Meridian, ranging from 0° at the Prime Meridian to 180° east or west.
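To make these ranges concrete, here is a minimal sketch of a sanity check for a coordinate pair. The helper name is_valid_coordinate is an assumption for illustration and is not part of the original code. It uses signed decimal degrees, where south and west are written as negative numbers.

```python
def is_valid_coordinate(latitude, longitude):
    """Return True if the pair lies within the valid ranges, in signed decimal degrees."""
    return -90.0 <= latitude <= 90.0 and -180.0 <= longitude <= 180.0


# Illustrative values: a point slightly south of the Equator, east of the Prime Meridian
print(is_valid_coordinate(-6.2, 106.8))
# A latitude beyond the poles is rejected
print(is_valid_coordinate(95.0, 106.8))
```

A check like this can catch swapped latitude and longitude values before they are sent to an API.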

Before we GET the longitude and latitude data, we need to load the province capitals saved in the previous part using this command:
import numpy as np
# Load file
indProv = np.load('location file.npy', allow_pickle = True)
In the next step, we create a new function called get_json_from_capital to load json data from the API URL.
import requests

def get_json_from_capital(value):
    # Ask the Open-Meteo geocoding API for the single best match of the capital's name
    url = f"https://geocoding-api.open-meteo.com/v1/search?name={value}&count=1&language=en&format=json"
    response = requests.get(url)
    return response.json()
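Some capital names contain spaces or other special characters, so it can be safer to let the standard library build the query string. The sketch below is one possible way to do that; build_geocoding_url and GEOCODING_URL are hypothetical names introduced here, and the query parameters are the same ones used in the URL above.

```python
from urllib.parse import urlencode

GEOCODING_URL = "https://geocoding-api.open-meteo.com/v1/search"


def build_geocoding_url(name, count=1, language="en"):
    # urlencode escapes spaces and special characters in the capital's name
    query = urlencode({"name": name, "count": count, "language": language, "format": "json"})
    return f"{GEOCODING_URL}?{query}"


print(build_geocoding_url("Banda Aceh"))
# https://geocoding-api.open-meteo.com/v1/search?name=Banda+Aceh&count=1&language=en&format=json
```

The resulting URL can be passed to requests.get in the same way as before. Adding a timeout argument, for example requests.get(url, timeout=10), keeps a slow response from blocking the script indefinitely.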
To run get_json_from_capital for every province capital, add this code:
result_json = []
for i in indProv:
    # 'ibukota' is Indonesian for 'capital'
    result_json.append(get_json_from_capital(i['ibukota']))

The flow of the process in the code above looks like this:

To combine result_json with our master list of province capitals (indProv), we iterate over the results and add new keys to the matching entry in indProv.
for index in range(len(result_json)):
    # Each item is already a dict, so no JSON round trip is needed
    data = result_json[index]
    if 'results' in data and len(data['results']) > 0:
        # Keep only the top match returned by the geocoding API
        res = data['results'][0]
        indProv[index]['country'] = res['country']
        indProv[index]['latitude'] = res['latitude']
        indProv[index]['longitude'] = res['longitude']
    else:
        # Print the index of any capital the API could not find
        print(index)
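The lookup logic can also be pulled into a small helper, which makes it easy to try out without calling the API. This is only a sketch: extract_location is a hypothetical name, and the sample dictionary is a trimmed-down, made-up response with rounded values, not real API output.

```python
def extract_location(data):
    """Return country, latitude and longitude of the top match, or None if there is none."""
    results = data.get("results") or []
    if not results:
        return None
    top = results[0]
    return {
        "country": top.get("country"),
        "latitude": top.get("latitude"),
        "longitude": top.get("longitude"),
    }


# Made-up sample response, for illustration only
sample = {"results": [{"name": "Jakarta", "country": "Indonesia", "latitude": -6.2, "longitude": 106.8}]}
print(extract_location(sample))
# A response without a 'results' key yields None
print(extract_location({}))
```

Returning None for a missing match lets the caller decide what to do, for example log the capital's name and retry with a different spelling.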

Don’t give up—if you feel confused or don’t fully understand, it’s a sign that you’re still learning and making an effort to grasp it.
We want weather information from the API for the date range from 1 January 2023 to 17 August 2024. Create a new function called get_weather_from_longlat.
import requests

def get_weather_from_longlat(longitude, latitude):
    url = f"https://api.open-meteo.com/v1/forecast?latitude={latitude}&longitude={longitude}&daily=weather_code,temperature_2m_max,temperature_2m_min,sunrise,sunset,daylight_duration,sunshine_duration,wind_speed_10m_max,wind_direction_10m_dominant&timezone=Asia%2FBangkok&start_date=2023-01-01&end_date=2024-08-18"
    response = requests.get(url)
    return response.json()
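Because the URL above is long and has the dates hardcoded, it may be easier to maintain if the query is built from named parts. The following is a minimal sketch under that assumption; build_weather_url, FORECAST_URL and DAILY_VARIABLES are hypothetical names, while the endpoint, variables, timezone and dates are copied from the URL above.

```python
from urllib.parse import urlencode

FORECAST_URL = "https://api.open-meteo.com/v1/forecast"
DAILY_VARIABLES = [
    "weather_code", "temperature_2m_max", "temperature_2m_min",
    "sunrise", "sunset", "daylight_duration", "sunshine_duration",
    "wind_speed_10m_max", "wind_direction_10m_dominant",
]


def build_weather_url(latitude, longitude, start_date, end_date, timezone="Asia/Bangkok"):
    query = urlencode(
        {
            "latitude": latitude,
            "longitude": longitude,
            "daily": ",".join(DAILY_VARIABLES),
            "timezone": timezone,
            "start_date": start_date,
            "end_date": end_date,
        },
        safe=",",  # keep the commas between daily variables readable
    )
    return f"{FORECAST_URL}?{query}"


# Illustrative coordinates; the dates match the URL in the function above
weather_url = build_weather_url(-6.2, 106.8, "2023-01-01", "2024-08-18")
print(weather_url)
```

With this helper, changing the date range or the list of daily variables only requires editing one argument or one list entry.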
Then we can call that function using the code below. We iterate over indProv and, for each capital, create a new .json file to store the result returned by the Open-Meteo API.
import json
import time

for index in range(len(indProv)):
    weather_data = get_weather_from_longlat(indProv[index]['longitude'], indProv[index]['latitude'])
    capital = indProv[index]['ibukota']
    # Save one file per capital, e.g. temp/json_history/1_<capital>.json
    with open(f'temp/json_history/{index+1}_{capital}.json', 'w') as json_file:
        json.dump(weather_data, json_file, indent = 4)
    # delay for 90 seconds
    time.sleep(90)
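The loop above assumes the folder temp/json_history already exists; otherwise open() raises an error. As a hedged sketch, the saving step could be wrapped in a helper that creates the folder first. The name save_weather_json is hypothetical, and the demo writes a made-up payload to a temporary folder rather than real weather data.

```python
import json
import os
import tempfile


def save_weather_json(weather_data, index, capital, out_dir):
    # Create the output folder if it is missing, so open() does not fail
    os.makedirs(out_dir, exist_ok=True)
    path = os.path.join(out_dir, f"{index + 1}_{capital}.json")
    with open(path, "w") as json_file:
        json.dump(weather_data, json_file, indent=4)
    return path


# Demo with a temporary folder and a made-up payload
demo_dir = os.path.join(tempfile.mkdtemp(), "json_history")
saved = save_weather_json({"daily": {"time": ["2023-01-01"]}}, 0, "Jakarta", demo_dir)
print(os.path.basename(saved))  # 1_Jakarta.json
```

Inside the real loop, the call would take weather_data, index, capital and 'temp/json_history' as arguments, followed by the same time.sleep(90) delay.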
