• Home
  • Basic Rank Tracker Python Script using Plotly connected to Google Search Console API

    Chris

    Administrator
    Staff member
    I've put together a Python script that connects to the Google Search Console API, pulls your search query data, and plots it so you can see trends over time. You'll need a bit of patience with setting up Google's API and some Python packages, namely Plotly.

    1709154362297.jpg


    By the end, you'll have a visual snapshot of your position (rankings), making it easier to spot where you're winning and losing and where there's room to improve. It should be easy enough to tweak for other metrics too. Even if you don't like Plotly, you'll like the reusable Google Search Console API connector.

    Step 1: Preparing Your Environment​

    Note - I have not tested in Colab; I run Python scripts locally or on a cloud server. Preference mostly.
    • Install Python: Ensure Python is installed on your computer, cloud, or whatever.
    • Install Required Libraries: You'll need to install the pandas, plotly, and google-auth-oauthlib libraries, among others mentioned in the script. You can do this by running pip install pandas plotly google-auth-oauthlib statsmodels in your command line or terminal. See below for Pips.
    Code:
    pip install pandas
    pip install plotly
    pip install google-auth-oauthlib
    pip install google-api-python-client
    pip install statsmodels

    Step 2: Setting Up Google Search Console API Access​

    This bit can be a bit tricky and might need a bit of patience. Well worth it, though, when you do.
    • Google Developer Console: Go to the Google Developer Console and create a project.
    • Enable API: Search for the Google Search Console API and enable it for your project.
    • Create Credentials: Create OAuth 2.0 credentials (type: OAuth Client ID). Choose "Desktop app" as the application type.
    • Download Credentials: Download the JSON file containing your credentials. This file is your key to accessing the API.

    Step 3: Preparing the Script​

    I use Windows Powershell to execute the scripts, so if you are using Colab, this might be different. Not tested using Colab (yet). Should be fine on Linux and Mac.
    • Place Your Credentials: Put the downloaded JSON file into a folder named secret in the same directory as your script.
    • Configure the Script: Edit the SITE_URL variable in the script to match the URL of the site you're analysing.

    Step 4: Running the Script​

    FYI - I named the python file plots.py, entirely your decision on naming the file, hence the script_name.py
    • Execute the Script: Open your terminal or command line, navigate to the directory containing the script, and run it by typing python script_name.py, replacing script_name.py with the actual name of your script.
    • Authenticate: On the first run, your web browser will open, asking you to log in with your Google account and grant access to the Google Search Console data. Follow the prompts.
    • Automated Data Fetching: The script will automatically fetch query data from your specified site, aggregate it on a weekly basis, and visualise the search query positions over time.

    What the Script Does:​

    • Fetches Data: It connects to the Google Search Console API to pull data for specific search queries related to your site over a certain period (the example is set to 180 days).
    • Processes Data: The data is aggregated on a weekly basis, calculating the average position of each query per week. (feel free to tweak) Note - it looks messy daily for such a long period as 180 days.
    • Visualises Data: Creates a plot showing the position of each query over time, giving you insights into how well your site is performing in search results.

    The Python Code:​

    Code:
    import datetime
    import os
    import pickle
    import pandas as pd
    import plotly.graph_objs as go
    from google_auth_oauthlib.flow import InstalledAppFlow
    from googleapiclient.discovery import build
    from statsmodels.nonparametric.smoothers_lowess import lowess
    
    # Configuration parameters
    API_VERSION = 'v1'
    SITE_URL = 'https://example.co.uk'  # Replace with the actual site URL
    
    # Function to get credentials file from the 'secret' folder
    def get_credentials_file_from_secret_folder():
        """Retrieve the credentials file from the 'secret' directory."""
        for filename in os.listdir('secret'):
            if filename.endswith('.json'):
                return os.path.join('secret', filename)
        raise FileNotFoundError("No JSON file found in the 'secret' folder.")
    
    # Authenticate with Google Search Console
    def authenticate_gsc():
        """Authenticate with Google Search Console."""
        creds = None
        if os.path.exists('token.pickle'):
            with open('token.pickle', 'rb') as token:
                creds = pickle.load(token)
        if not creds or not creds.valid:
            CREDENTIALS_FILE = get_credentials_file_from_secret_folder()
            flow = InstalledAppFlow.from_client_secrets_file(
                CREDENTIALS_FILE,
                scopes=['https://www.googleapis.com/auth/webmasters.readonly']
            )
            creds = flow.run_local_server(port=8080)
            with open('token.pickle', 'wb') as token:
                pickle.dump(creds, token)
        return build('searchconsole', API_VERSION, credentials=creds)
    
    # Array of keywords
    QUERIES = [
        'query 1',
        'query keyword 2',
        'query term 3',
        'and so on 4',
        'final query without comma'
        # FYI - continue as needed, anything 30 looks messy
    ]
    
    # Function to fetch query data
    def fetch_query_data(service, queries, start_date, end_date):
        all_data = []
        for query in queries:
            request = {
                'startDate': start_date,
                'endDate': end_date,
                'dimensions': ['query', 'date'],
                'dimensionFilterGroups': [{
                    'filters': [{
                        'dimension': 'query',
                        'operator': 'equals',
                        'expression': query
                    }]
                }],
                'rowLimit': 25000
            }
            response = service.searchanalytics().query(siteUrl=SITE_URL, body=request).execute()
            for row in response.get('rows', []):
                all_data.append({
                    'query': row['keys'][0],
                    'date': datetime.datetime.strptime(row['keys'][1], '%Y-%m-%d'),
                    'position': row['position']
                })
        df = pd.DataFrame(all_data)
        df['date'] = pd.to_datetime(df['date'])
        # Resample to weekly frequency, taking the mean position for the week
        df.set_index('date', inplace=True)
        weekly_df = df.groupby('query').resample('W').mean().reset_index()
        return weekly_df
    
    # Function to visualise insights with smoother lines
    def visualize_insights(df):
        fig = go.Figure()
        for query in df['query'].unique():
            query_df = df[df['query'] == query]
            fig.add_trace(go.Scatter(
                x=query_df['date'],
                y=query_df['position'],
                mode='lines',
                name=query,
                line=dict(shape='spline')  # Spline shape for smooth line
            ))
    
        # Update layout
        fig.update_layout(
            title='Search Query Positions Over Time - Aggregated Weekly',
            xaxis_title='Date',
            yaxis_title='Position',
            yaxis=dict(autorange='reversed'),
            hovermode='x unified',
            template='plotly_white'
        )
    
        fig.show()
    
    # Main execution logic
    def main():
        service = authenticate_gsc()
        end_date = datetime.datetime.now().strftime('%Y-%m-%d')
        start_date = (datetime.datetime.now() - datetime.timedelta(days=180)).strftime('%Y-%m-%d')
        query_data = fetch_query_data(service, QUERIES, start_date, end_date)
        query_data['date'] = pd.to_datetime(query_data['date'])
        visualize_insights(query_data)
    
    if __name__ == '__main__':
        main()

    Things to Keep in Mind:​

    • Token File: After the first authentication, a file named token.pickle is created in your script's directory. This file stores your access token, so you won't have to authenticate every time you run the script. You'll thank me for that alone.
    • Modify Queries: The QUERIES list contains the search queries you're interested in. Modify this list to suit your needs.
    • Weekly Data: You may notice that the final week in the series may end with a few days in the future that's because it is set to series the data weekly.

    And that's a wrap, I hope you have fun experimenting with Plotly connected to Google Search Console API. I'm new to it, too, and still experimenting, so by no means do I think I'm an expert. I got ChatGPT to write half of the functions. Even that was a struggle; it's not as reliable as it once was.
     
    Last edited: