Monday, June 5, 2023

The Moz Links API: Touch Every Endpoint in Python

The purpose of this Jupyter Notebook is to introduce the Moz Links API using Python. This should work on any notebook hosting environment, such as Google Colab.

If you’re looking at this on GitHub, the code snippets can be copy/pasted into your own notebook environment. By the time you’ve run this script to the bottom, you will have used every Moz Links API endpoint, and can pick the parts you want for your own project. The official documentation can be found here.

Confused? Be sure to check out my intro to the Moz Links API.

Do global imports

The import statements at the top of a Python program are used to load external resources that are not loaded by default in the Python interpreter. These resources may include libraries or modules that provide additional functionality to the program.

Import statements are usually placed at the top of a program, before any other code is executed. This allows the program to load any necessary resources before they are needed in the program.

Once the resources have been loaded using import statements, they can be used anywhere in the program, not just in the cell where the import statement was written. This allows the program to access the functionality provided by the imported resources throughout its execution.

The libraries used here that are not part of the Python standard library are requests and sqlitedict. You can install them with pip install requests and pip install sqlitedict in your terminal or a Jupyter cell. If you’re using Anaconda, requests is pre-installed.

import json
import requests
from headlines import *
from pprint import pprint
from sqlitedict import SqliteDict as sqldict
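
The headlines module imported above is a local helper that is not shown in this post. If you don’t have it, a minimal stand-in might look like the sketch below. It assumes that h1 through h4 simply print their text as Markdown-style headings; the real module may render headings differently, and format_heading is a hypothetical helper name.

```python
# Hypothetical stand-in for the local "headlines" helper module.
# Assumption: each function prints its text as a Markdown-style heading.


def format_heading(level, text):
    """Return text prefixed with `level` hash marks and a space."""
    return f"{'#' * level} {text}"


def h1(text):
    print(format_heading(1, text))


def h2(text):
    print(format_heading(2, text))


def h3(text):
    print(format_heading(3, text))


def h4(text):
    print(format_heading(4, text))
```

Save this as headlines.py next to the notebook, or paste it into a cell and drop the from headlines import * line.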

Load login values from external file

The code below reads a file named “linksapi.txt” from the “assets” directory, which contains the login credentials: the access ID and secret key needed to access the Moz API. These credentials are extracted from the file and assigned to two variables named ACCESSID and SECRETKEY. The with statement is used to ensure that the file is properly closed after it’s been read. Create a file whose contents look like this, using your own credentials retrieved manually from moz.com:

ACCESSID: mozscape-1234567890
SECRETKEY: 1234567890abcdef1234567890abcdef

Once the credentials are extracted from the file, they are stored in a tuple named AUTH_TUPLE. This tuple can be used as an argument to the Moz API functions to authenticate and authorize access to the data.

The purpose of this approach is to avoid hard-coding sensitive login credentials directly in the program, which could pose a security risk if the code was shared or published publicly. Instead, the credentials are kept in a separate file that is not included in the repository, and can be easily created and updated as needed. This way, the code can be shared without exposing the credentials to the public.

with open("../assets/linksapi.txt") as fh:
    ACCESSID, SECRETKEY = [x.strip().split(" ")[1] for x in fh.readlines()]

AUTH_TUPLE = (ACCESSID, SECRETKEY)  # Don't show contents
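
The one-line parser above depends on the file holding exactly two lines, in that order, with a single space after each colon. A slightly more forgiving sketch is shown below. The function name parse_credentials is hypothetical, and it assumes the same “NAME: value” layout described earlier.

```python
def parse_credentials(text):
    """Parse 'NAME: value' lines into a dict, skipping blank or malformed lines."""
    creds = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # nothing to split on, so ignore this line
        name, value = line.split(":", 1)  # split on the first colon only
        creds[name.strip()] = value.strip()
    return creds


# Usage sketch with the same file as above:
# with open("../assets/linksapi.txt") as fh:
#     creds = parse_credentials(fh.read())
# AUTH_TUPLE = (creds["ACCESSID"], creds["SECRETKEY"])
```

Looking the values up by name means the order of the lines in the file no longer matters.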

Configure variables

In this code, there are several configuration variables that are used to set up the API call to the Moz Links API.

The first variable, COMMON_ENDPOINT, is a constant that stores the base URL shared by every Moz Links API endpoint. The second variable, sub_endpoint, is a string that represents the endpoint subpath for the anchor text data. The third variable, endpoint, appends sub_endpoint to COMMON_ENDPOINT to form the complete API endpoint URL.

The fourth variable, data_dict, is a dictionary that contains the parameters for the API request. In this case, the data_dict specifies the target URL for which we want to retrieve anchor text data, the scope of the data (in this case, page-level), and a limit of 1 result.

Finally, the json_string variable is created by converting the data_dict dictionary into a JSON-formatted string using the json.dumps() function. This string will be used as the request body when making the API call.

These variables are used to configure and parameterize the API request, and can be modified to perform any data_dict request against any Moz Links API sub_endpoint.

COMMON_ENDPOINT = "https://lsapi.seomoz.com/v2/"
sub_endpoint = "anchor_text"
endpoint = COMMON_ENDPOINT + sub_endpoint
data_dict = {"target": "moz.com/blog", "scope": "page", "limit": 1}
json_string = json.dumps(data_dict)
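
To see how these pieces combine for a different sub-endpoint, here is a small sketch. The helper name build_request is hypothetical; the URL prefix and the url_metrics request are the ones used elsewhere in this notebook.

```python
import json

COMMON_ENDPOINT = "https://lsapi.seomoz.com/v2/"


def build_request(sub_endpoint, data_dict):
    """Return the full endpoint URL and the JSON request body as a string."""
    return COMMON_ENDPOINT + sub_endpoint, json.dumps(data_dict)


url, body = build_request("url_metrics", {"targets": ["moz.com", "nytimes.com"]})
print(url)   # https://lsapi.seomoz.com/v2/url_metrics
print(body)  # {"targets": ["moz.com", "nytimes.com"]}
```

Nothing is sent over the network here; the function only prepares the two values that requests.post() needs.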

Actually hit the API (ensure success)

In JupyterLab, the value of the last expression in a code cell is displayed in the output area without requiring an explicit print() call. The code below uses the requests module to send a POST request to the URL stored in endpoint, with the JSON string json_string as the request body. The authentication details are passed using the AUTH_TUPLE variable.

The call returns a response object. Displaying that object on its own shows only the HTTP status code, for example <Response [200]> for success or <Response [404]> for not found.

Finally, the .json() method is called on the response object to parse the response data as JSON and return it as a Python dictionary. Here the dictionary is passed to pprint() so the output is easier to read; it could just as well be assigned to a variable for further processing.

response = requests.post(endpoint, data=json_string, auth=AUTH_TUPLE)
pprint(response.json())

Outputs:

{'next_token': 'JYkQVg4s9ak8iRBWDiz1qTyguYswnj035nqjRF0IbW96IGJsb2e58hGzcmSomw==',
 'results': [{'anchor_text': 'moz',
              'external_pages': 7183,
              'external_root_domains': 2038}]}
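
The cell above prints whatever comes back, even when the request fails. One way to ensure success before using the data is sketched below. It assumes that any status other than 200 should be treated as a failure; checked_json and FakeResponse are hypothetical names, and FakeResponse only stands in for a real requests response so the example runs without network access.

```python
def checked_json(response):
    """Return the parsed JSON body, or raise if the HTTP status is not 200."""
    if response.status_code != 200:
        raise RuntimeError(f"Request failed with HTTP status {response.status_code}")
    return response.json()


class FakeResponse:
    """Stand-in exposing the two members used above: status_code and json()."""

    def __init__(self, status_code, payload):
        self.status_code = status_code
        self._payload = payload

    def json(self):
        return self._payload


print(checked_json(FakeResponse(200, {"results": []})))  # {'results': []}
```

With a real response you would call checked_json(response) in place of response.json().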

List Sub-endpoints

This code defines a list of different sub-endpoints that can be appended to a common URL prefix to make different API endpoints. An API endpoint is a URL where an API can be accessed by clients. It is a point of entry to the application that acts as a gatekeeper between the client and the server. Each endpoint is identified by a unique URL, which can be used to interact with the API.

In this code, the list of sub-endpoints is defined in the sub_endpoints variable, and each endpoint is represented as a string. The for loop iterates over the list, prints the index number and name of each sub-endpoint using the print function, and increments the index by 1. The enumerate function is used to generate a sequence of pairs consisting of an index and a value from the list.

This code is useful for exploring the available endpoints for a particular API and for selecting the endpoint that corresponds to the desired functionality. By changing the sub-endpoint in the URL, clients can access different resources or perform different operations on the server.

sub_endpoints = [
    "anchor_text",
    "final_redirect",
    "global_top_pages",
    "global_top_root_domains",
    "index_metadata",
    "link_intersect",
    "link_status",
    "linking_root_domains",
    "links",
    "top_pages",
    "url_metrics",
    "usage_data",
]
for i, sub_endpoint in enumerate(sub_endpoints):
    print(i + 1, sub_endpoint)

Outputs:

1 anchor_text
2 final_redirect
3 global_top_pages
4 global_top_root_domains
5 index_metadata
6 link_intersect
7 link_status
8 linking_root_domains
9 links
10 top_pages
11 url_metrics
12 usage_data

Human-friendly labels

This code defines two lists: names and descriptions. The names list contains human-friendly labels for the set of sub-endpoints, while the descriptions list provides a brief description of each endpoint. The two lists are kept in the same order as the sub_endpoints list defined earlier in the code.

By keeping the three lists in the same order, they can be “zipped” together into a single list of tuples using the zip function. This produces a new list where each tuple contains the name, endpoint, and description for a particular API endpoint. This makes it easy to display a user-friendly summary of each API endpoint with its name and description.

The zip function combines the elements of the three lists element-wise, creating a tuple of the first elements from each list, then a tuple of the second elements, and so on. The resulting list of tuples can be iterated over, and each tuple unpacked to access the individual name, endpoint, and description elements for each API endpoint.

names = [
    "Anchor Text",
    "Final Redirect",
    "Global Top Pages",
    "Global Top Root Domains",
    "Index Metadata",
    "Link Intersect",
    "Link Status",
    "Linking Root Domains",
    "Links",
    "Top Pages",
    "URL Metrics",
    "Usage Data",
]

descriptions = [
    "Use this endpoint to get data about anchor text used by followed external links to a target. Results are ordered by external_root_domains descending.",
    "Use this endpoint to get data about anchor text used by followed external links to a target. Results are ordered by external_root_domains descending.",
    "This endpoint returns the top 500 pages in the entire index with the highest Page Authority values, sorted by Page Authority. (Visit the Top 500 Sites list to explore the top root domains on the web, sorted by Domain Authority.)",
    "This endpoint returns the top 500 pages in the entire index with the highest Page Authority values, sorted by Page Authority. (Visit the Top 500 Sites list to explore the top root domains on the web, sorted by Domain Authority.)",
    "This endpoint returns the top 500 pages in the entire index with the highest Page Authority values, sorted by Page Authority. (Visit the Top 500 Sites list to explore the top root domains on the web, sorted by Domain Authority.)",
    "Use this endpoint to get sources that link to at least one of a list of positive targets and don't link to any of a list of negative targets.",
    "Use this endpoint to get information about links from many sources to a single target.",
    "Use this endpoint to get linking root domains to a target.",
    "Use this endpoint to get links to a target.",
    "This endpoint returns top pages on a target domain.",
    "Use this endpoint to get metrics about one or more urls.",
    "This endpoint Returns the number of rows consumed so far in the current billing period. The count returned might not reflect rows consumed in the last hour. The count returned reflects rows consumed by requests to both the v1 (Moz Links API) and v2 Links APIs.",
]

# Simple zipping example
list(zip(names, sub_endpoints, descriptions))

Outputs:

[('Anchor Text',
  'anchor_text',
  'Use this endpoint to get data about anchor text used by followed external links to a target. Results are ordered by external_root_domains descending.'),
 ('Final Redirect',
  'final_redirect',
  'Use this endpoint to get data about anchor text used by followed external links to a target. Results are ordered by external_root_domains descending.'),
 ('Global Top Pages',
  'global_top_pages',
  'This endpoint returns the top 500 pages in the entire index with the highest Page Authority values, sorted by Page Authority. (Visit the Top 500 Sites list to explore the top root domains on the web, sorted by Domain Authority.)'),
 ('Global Top Root Domains',
  'global_top_root_domains',
  'This endpoint returns the top 500 pages in the entire index with the highest Page Authority values, sorted by Page Authority. (Visit the Top 500 Sites list to explore the top root domains on the web, sorted by Domain Authority.)'),
 ('Index Metadata',
  'index_metadata',
  'This endpoint returns the top 500 pages in the entire index with the highest Page Authority values, sorted by Page Authority. (Visit the Top 500 Sites list to explore the top root domains on the web, sorted by Domain Authority.)'),
 ('Link Intersect',
  'link_intersect',
  "Use this endpoint to get sources that link to at least one of a list of positive targets and don't link to any of a list of negative targets."),
 ('Link Status',
  'link_status',
  'Use this endpoint to get information about links from many sources to a single target.'),
 ('Linking Root Domains',
  'linking_root_domains',
  'Use this endpoint to get linking root domains to a target.'),
 ('Links', 'links', 'Use this endpoint to get links to a target.'),
 ('Top Pages',
  'top_pages',
  'This endpoint returns top pages on a target domain.'),
 ('URL Metrics',
  'url_metrics',
  'Use this endpoint to get metrics about one or more urls.'),
 ('Usage Data',
  'usage_data',
  'This endpoint Returns the number of rows consumed so far in the current billing period. The count returned might not reflect rows consumed in the last hour. The count returned reflects rows consumed by requests to both the v1 (Moz Links API) and v2 Links APIs.')]
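
Unpacking the zipped tuples in a loop, as described above, can be sketched with two shortened lists. The demo_ variable names are made up for this illustration so they don’t overwrite the notebook’s real lists.

```python
demo_endpoints = ["anchor_text", "links", "url_metrics"]
demo_names = ["Anchor Text", "Links", "URL Metrics"]

# Each tuple from zip() is unpacked into two loop variables.
for name, sub_endpoint in zip(demo_names, demo_endpoints):
    print(f"{name}: {sub_endpoint}")

# zip() can also build a lookup table keyed by sub-endpoint.
labels = dict(zip(demo_endpoints, demo_names))
print(labels["links"])  # Links
```

A lookup table like labels lets you find a label by its sub-endpoint name rather than by its position in a list.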

Show an example request for each endpoint

This is a list of API requests in Python dict format, where each dictionary represents a request to a specific endpoint. Don’t hurt your brain too much trying to read it. Just know that I lifted each example from the original Moz documentation and listed them all here in order as nested Python dicts.

You could call the format a dict of dicts, where each sub-dictionary is keyed by its sub-endpoint name, so it can be looked up while looping over the sub_endpoints, names, and descriptions lists. Running the cell below does that combining to document every sub_endpoint.

dict_of_dicts = {
    "anchor_text": {"target": "moz.com/blog", "scope": "page", "limit": 5},
    "links": {
        "target": "moz.com/blog",
        "target_scope": "page",
        "filter": "external+nofollow",
        "limit": 1,
    },
    "final_redirect": {"page": "seomoz.org/blog"},
    "global_top_pages": {"limit": 5},
    "global_top_root_domains": {"limit": 5},
    "index_metadata": {},
    "link_intersect": {
        "positive_targets": [
            {"target": "latimes.com", "scope": "root_domain"},
            {"target": "blog.nytimes.com", "scope": "subdomain"},
        ],
        "negative_targets": [{"target": "moz.com", "scope": "root_domain"}],
        "source_scope": "page",
        "sort": "source_domain_authority",
        "limit": 1,
    },
    "link_status": {
        "target": "moz.com/blog",
        "sources": ["twitter.com", "linkedin.com"],
        "source_scope": "root_domain",
        "target_scope": "page",
    },
    "linking_root_domains": {
        "target": "moz.com/blog",
        "target_scope": "page",
        "filter": "external",
        "sort": "source_domain_authority",
        "limit": 5,
    },
    "top_pages": {"target": "moz.com", "scope": "root_domain", "limit": 5},
    "url_metrics": {"targets": ["moz.com", "nytimes.com"]},
    "usage_data": {},
}

for i, sub_endpoint in enumerate(sub_endpoints):
    h1(f"{i + 1}. {names[i]} ({sub_endpoint})")
    print(descriptions[i])
    h4("Example request:")
    pprint(dict_of_dicts[sub_endpoint])
    print()

Outputs:

# 2. Final Redirect (final_redirect)

Use this endpoint to get data about anchor text used by followed external links to a target. Results are ordered by external_root_domains descending.
Example request:

{'page': 'seomoz.org/blog'}

[...]

Write a function that hits the API

If we’re going to hit an API over and over in mostly the same way, we want to spare ourselves re-typing everything all the time. That’s why we define functions. That’s the def in the below cell. Once that cell is run, the moz() function can be used anywhere in this Notebook. You need only feed it the sub_endpoint you want to use and a Python dict of your request. It will return the API’s response.

def moz(sub_endpoint, data_dict):
    """Hits Moz Links API with specified endpoint and request and returns results."""
    json_string = json.dumps(data_dict)
    endpoint = COMMON_ENDPOINT + sub_endpoint
    # Below, data is a string (flattened JSON) but auth is a 2-position tuple.
    response = requests.post(endpoint, data=json_string, auth=AUTH_TUPLE)
    return response

This does not output anything to the screen. It just defines the function.

Conditionally hit the API

The code uses a Python package called sqlitedict, which provides a persistent dictionary-like object that can be stored on disk using the SQLite database engine. The with statement in the code sets up a context manager for the SqliteDict object, which automatically handles opening and closing the database connection. The database file is stored at ../dbs/linksapi.db.

The code iterates through each sub-endpoint in the sub_endpoints list, and checks if that data has already been retrieved. If it hasn’t, the API is called using the moz() function and the result is saved in the SqliteDict. The db.commit() statement ensures that any changes made to the dictionary during the iteration are saved to the database.

The SqliteDict serves as a local cache to prevent the API from being hit every time the code block is run if the data has already been collected. By using this cache, the code reduces the number of API requests required, which is useful when working with APIs that have quota limits. Congratulations, you’re using a database!

with sqldict("../dbs/linksapi.db") as db:
    for sub_endpoint in sub_endpoints:
        if sub_endpoint not in db:
            print(sub_endpoint)
            result = moz(sub_endpoint, dict_of_dicts[sub_endpoint])
            db[sub_endpoint] = result
            db.commit()
            print("API hit and response saved!")
            print()
h2("Done")

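On the first run, this prints the name of each sub-endpoint as it is fetched, followed by a confirmation that the response was saved. On later runs the loop prints nothing, because every response is already in the local database, and only the final “Done” heading appears.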
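
The caching idea itself is independent of SQLite. The sketch below shows the same check-then-fetch pattern with a plain dict and a fake fetch function, so it runs without network access or a database file. The names fetch_once and fake_fetch are hypothetical.

```python
def fetch_once(cache, key, fetch):
    """Return cache[key], calling fetch(key) only when the key is missing."""
    if key not in cache:
        cache[key] = fetch(key)
    return cache[key]


calls = []


def fake_fetch(key):
    """Stand-in for an API call; records each time it is invoked."""
    calls.append(key)
    return {"endpoint": key}


cache = {}  # a plain dict here; SqliteDict is used the same way but persists to disk
fetch_once(cache, "usage_data", fake_fetch)
fetch_once(cache, "usage_data", fake_fetch)
print(len(calls))  # 1, because the second lookup is served from the cache
```

Swapping the dict for a SqliteDict keeps the logic the same; the difference is that the stored values survive between notebook sessions once they are committed.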

Show the locally-stored API responses

This code uses the sqldict context manager to open the SQLite database containing the previously retrieved API data. It then iterates over the keys in the database, which correspond to the endpoints that were previously retrieved.

For each key, the code prints the endpoint name, description, and the data retrieved from the API. The pprint function is used to print the JSON data in a more human-readable format, with indentation and line breaks that make it easier to read.

with sqldict("../dbs/linksapi.db") as db:
    for i, key in enumerate(db):
        h1(f"{i + 1}. {names[i]} ({key})")
        print(descriptions[i])
        print()
        pprint(db[key].json())
        print()

Outputs:

1. Anchor Text (anchor_text)
Use this endpoint to get data about anchor text used by followed external links to a target. Results are ordered by external_root_domains descending.

{'next_token': 'KIkQVg4s9ak8iRBWDiz1qTyguYswnj035n7bYI0Lc2VvbW96IGJsb2dKBcCodcl47Q==',
 'results': [{'anchor_text': 'moz',
              'external_pages': 7162,
              'external_root_domains': 2026},
             {'anchor_text': 'moz blog',
              'external_pages': 15525,
              'external_root_domains': 1364},
             {'anchor_text': 'the moz blog',
              'external_pages': 7879,
              'external_root_domains': 728},
             {'anchor_text': 'seomoz',
              'external_pages': 17741,
              'external_root_domains': 654},
             {'anchor_text': 'https://moz.com/blog',
              'external_pages': 978,
              'external_root_domains': 491}]}

2. Final Redirect (final_redirect)
Use this endpoint to get data about anchor text used by followed external links to a target. Results are ordered by external_root_domains descending.

{'page': 'moz.com/blog'}

3. Global Top Pages (global_top_pages)
This endpoint returns the top 500 pages in the entire index with the highest Page Authority values, sorted by Page Authority. (Visit the Top 500 Sites list to explore the top root domains on the web, sorted by Domain Authority.)

{'next_token': 'BcLbRwBmrXHK',
 'results': [{'deleted_pages_to_page': 11932076,
              'deleted_pages_to_root_domain': 23942663640,
              'deleted_pages_to_subdomain': 21555752652,
              'deleted_root_domains_to_page': 64700,
              'deleted_root_domains_to_root_domain': 3688228,
              'deleted_root_domains_to_subdomain': 3516235,
              'domain_authority': 96,
              'external_indirect_pages_to_root_domain': 5042652519,
              'external_nofollow_pages_to_page': 31163,
              'external_nofollow_pages_to_root_domain': 12375460748,
              'external_nofollow_pages_to_subdomain': 11393036086,
              'external_pages_to_page': 118102549,
              'external_pages_to_root_domain': 91362310623,
              'external_pages_to_subdomain': 83283626903,
              'external_redirect_pages_to_page': 0,
              'external_redirect_pages_to_root_domain': 445730476,
              'external_redirect_pages_to_subdomain': 432323198,
              'http_code': 5,
              'indirect_root_domains_to_page': 0,
              'indirect_root_domains_to_root_domain': 701121,
              'last_crawled': '2023-01-15',
              'link_propensity': 1.76710455e-05,
              'nofollow_pages_from_page': 0,
              'nofollow_pages_from_root_domain': 2,
              'nofollow_pages_to_page': 31163,
              'nofollow_pages_to_root_domain': 12375623717,
              'nofollow_pages_to_subdomain': 11393036179,
              'nofollow_root_domains_from_page': 0,
              'nofollow_root_domains_from_root_domain': 0,
              'nofollow_root_domains_to_page': 980,
              'nofollow_root_domains_to_root_domain': 3696150,
              'nofollow_root_domains_to_subdomain': 3622349,
              'page': 'www.facebook.com/Plesk',
              'page_authority': 100,
              'pages_crawled_from_root_domain': 1810872,
              'pages_from_page': 0,
              'pages_from_root_domain': 5289,
              'pages_to_page': 118102549,
              'pages_to_root_domain': 91368257043,
              'pages_to_subdomain': 83288001442,
              'redirect_pages_to_page': 0,
              'redirect_pages_to_root_domain': 447189164,
              'redirect_pages_to_subdomain': 433411292,
              'root_domain': 'facebook.com',
              'root_domains_from_page': 0,
              'root_domains_from_root_domain': 32,
              'root_domains_to_page': 491956,
              'root_domains_to_root_domain': 59416650,
              'root_domains_to_subdomain': 50993087,
              'spam_score': 1,
              'subdomain': 'www.facebook.com',
              'title': ''},
             {'deleted_pages_to_page': 5828966,
              'deleted_pages_to_root_domain': 79909678,
              'deleted_pages_to_subdomain': 79909678,
              'deleted_root_domains_to_page': 16552,
              'deleted_root_domains_to_root_domain': 98416,
              'deleted_root_domains_to_subdomain': 98416,
              'domain_authority': 94,
              'external_indirect_pages_to_root_domain': 1177381629,
              'external_nofollow_pages_to_page': 453328699,
              'external_nofollow_pages_to_root_domain': 1643990147,
              'external_nofollow_pages_to_subdomain': 1643990147,
              'external_pages_to_page': 456279611,
              'external_pages_to_root_domain': 2808523112,
              'external_pages_to_subdomain': 2808523112,
              'external_redirect_pages_to_page': 125,
              'external_redirect_pages_to_root_domain': 24941546,
              'external_redirect_pages_to_subdomain': 24941546,
              'http_code': 3,
              'indirect_root_domains_to_page': 723,
              'indirect_root_domains_to_root_domain': 252606,
              'last_crawled': '2023-01-14',
              'link_propensity': 0.118001014,
              'nofollow_pages_from_page': 0,
              'nofollow_pages_from_root_domain': 121166,
              'nofollow_pages_to_page': 453328699,
              'nofollow_pages_to_root_domain': 1644293277,
              'nofollow_pages_to_subdomain': 1644293277,
              'nofollow_root_domains_from_page': 0,
              'nofollow_root_domains_from_root_domain': 67627,
              'nofollow_root_domains_to_page': 9800973,
              'nofollow_root_domains_to_root_domain': 4959747,
              'nofollow_root_domains_to_subdomain': 4959747,
              'page': 'wordpress.com/?ref=footer_blog',
              'page_authority': 100,
              'pages_crawled_from_root_domain': 1731019,
              'pages_from_page': 0,
              'pages_from_root_domain': 1080338,
              'pages_to_page': 456293004,
              'pages_to_root_domain': 2817137385,
              'pages_to_subdomain': 2817137385,
              'redirect_pages_to_page': 125,
              'redirect_pages_to_root_domain': 25449067,
              'redirect_pages_to_subdomain': 25449067,
              'root_domain': 'wordpress.com',
              'root_domains_from_page': 0,
              'root_domains_from_root_domain': 204262,
              'root_domains_to_page': 9878742,
              'root_domains_to_root_domain': 12653294,
              'root_domains_to_subdomain': 12653294,
              'spam_score': 1,
              'subdomain': 'wordpress.com',
              'title': ''},
             {'deleted_pages_to_page': 3904778,
              'deleted_pages_to_root_domain': 23942663640,
              'deleted_pages_to_subdomain': 21555752652,
              'deleted_root_domains_to_page': 11671,
              'deleted_root_domains_to_root_domain': 3688228,
              'deleted_root_domains_to_subdomain': 3516235,
              'domain_authority': 96,
              'external_indirect_pages_to_root_domain': 5042652519,
              'external_nofollow_pages_to_page': 4449343,
              'external_nofollow_pages_to_root_domain': 12375460748,
              'external_nofollow_pages_to_subdomain': 11393036086,
              'external_pages_to_page': 59602588,
              'external_pages_to_root_domain': 91362310623,
              'external_pages_to_subdomain': 83283626903,
              'external_redirect_pages_to_page': 12625,
              'external_redirect_pages_to_root_domain': 445730476,
              'external_redirect_pages_to_subdomain': 432323198,
              'http_code': 5,
              'indirect_root_domains_to_page': 1632,
              'indirect_root_domains_to_root_domain': 701121,
              'last_crawled': '2023-01-16',
              'link_propensity': 1.76710455e-05,
              'nofollow_pages_from_page': 0,
              'nofollow_pages_from_root_domain': 2,
              'nofollow_pages_to_page': 4449343,
              'nofollow_pages_to_root_domain': 12375623717,
              'nofollow_pages_to_subdomain': 11393036179,
              'nofollow_root_domains_from_page': 0,
              'nofollow_root_domains_from_root_domain': 0,
              'nofollow_root_domains_to_page': 28624,
              'nofollow_root_domains_to_root_domain': 3696150,
              'nofollow_root_domains_to_subdomain': 3622349,
              'page': 'www.facebook.com/home.php',
              'page_authority': 100,
              'pages_crawled_from_root_domain': 1810872,
              'pages_from_page': 0,
              'pages_from_root_domain': 5289,
              'pages_to_page': 59602589,
              'pages_to_root_domain': 91368257043,
              'pages_to_subdomain': 83288001442,
              'redirect_pages_to_page': 12626,
              'redirect_pages_to_root_domain': 447189164,
              'redirect_pages_to_subdomain': 433411292,
              'root_domain': 'facebook.com',
              'root_domains_from_page': 0,
              'root_domains_from_root_domain': 32,
              'root_domains_to_page': 239697,
              'root_domains_to_root_domain': 59416650,
              'root_domains_to_subdomain': 50993087,
              'spam_score': 1,
              'subdomain': 'www.facebook.com',
              'title': ''},
             {'deleted_pages_to_page': 3440567,
              'deleted_pages_to_root_domain': 3440700,
              'deleted_pages_to_subdomain': 3440700,
              'deleted_root_domains_to_page': 60839,
              'deleted_root_domains_to_root_domain': 60840,
              'deleted_root_domains_to_subdomain': 60840,
              'domain_authority': 1,
              'external_indirect_pages_to_root_domain': 7,
              'external_nofollow_pages_to_page': 288,
              'external_nofollow_pages_to_root_domain': 1499,
              'external_nofollow_pages_to_subdomain': 1499,
              'external_pages_to_page': 140954613,
              'external_pages_to_root_domain': 140959216,
              'external_pages_to_subdomain': 140959213,
              'external_redirect_pages_to_page': 70,
              'external_redirect_pages_to_root_domain': 70,
              'external_redirect_pages_to_subdomain': 70,
              'http_code': 200,
              'indirect_root_domains_to_page': 0,
              'indirect_root_domains_to_root_domain': 0,
              'last_crawled': '2018-02-05',
              'link_propensity': 0.3998428881,
              'nofollow_pages_from_page': 12,
              'nofollow_pages_from_root_domain': 805,
              'nofollow_pages_to_page': 288,
              'nofollow_pages_to_root_domain': 10799,
              'nofollow_pages_to_subdomain': 10799,
              'nofollow_root_domains_from_page': 2,
              'nofollow_root_domains_from_root_domain': 7,
              'nofollow_root_domains_to_page': 30,
              'nofollow_root_domains_to_root_domain': 30,
              'nofollow_root_domains_to_subdomain': 30,
              'page': 'music.skyrock.com/',
              'page_authority': 100,
              'pages_crawled_from_root_domain': 2546,
              'pages_from_page': 61,
              'pages_from_root_domain': 3382,
              'pages_to_page': 140956009,
              'pages_to_root_domain': 141008586,
              'pages_to_subdomain': 141008583,
              'redirect_pages_to_page': 70,
              'redirect_pages_to_root_domain': 70,
              'redirect_pages_to_subdomain': 70,
              'root_domain': 'music.skyrock.com',
              'root_domains_from_page': 19,
              'root_domains_from_root_domain': 1018,
              'root_domains_to_page': 10609865,
              'root_domains_to_root_domain': 10609868,
              'root_domains_to_subdomain': 10609868,
              'spam_score': 9,
              'subdomain': 'music.skyrock.com',
              'title': 'Blog de Music - DES NEWS, DES CLIPS, DES INTERVIEWS - '
                       'Skyrock.com'},
             {'deleted_pages_to_page': 64159924,
              'deleted_pages_to_root_domain': 17641375891,
              'deleted_pages_to_subdomain': 336246205,
              'deleted_root_domains_to_page': 63574,
              'deleted_root_domains_to_root_domain': 1728606,
              'deleted_root_domains_to_subdomain': 234073,
              'domain_authority': 100,
              'external_indirect_pages_to_root_domain': 19281720347,
              'external_nofollow_pages_to_page': 34635431,
              'external_nofollow_pages_to_root_domain': 7885369442,
              'external_nofollow_pages_to_subdomain': 184067821,
              'external_pages_to_page': 285612569,
              'external_pages_to_root_domain': 55013651418,
              'external_pages_to_subdomain': 1492976347,
              'external_redirect_pages_to_page': 593282,
              'external_redirect_pages_to_root_domain': 250423075,
              'external_redirect_pages_to_subdomain': 5678006,
              'http_code': 302,
              'indirect_root_domains_to_page': 1072,
              'indirect_root_domains_to_root_domain': 231256,
              'last_crawled': '2023-04-01',
              'link_propensity': 0.006248265505,
              'nofollow_pages_from_page': 0,
              'nofollow_pages_from_root_domain': 991472,
              'nofollow_pages_to_page': 34635436,
              'nofollow_pages_to_root_domain': 7948674425,
              'nofollow_pages_to_subdomain': 184068512,
              'nofollow_root_domains_from_page': 0,
              'nofollow_root_domains_from_root_domain': 182393,
              'nofollow_root_domains_to_page': 126656,
              'nofollow_root_domains_to_root_domain': 2322389,
              'nofollow_root_domains_to_subdomain': 304381,
              'page': 'youtube.com/',
              'page_authority': 100,
              'pages_crawled_from_root_domain': 41258009,
              'pages_from_page': 0,
              'pages_from_root_domain': 11109186,
              'pages_to_page': 285612606,
              'pages_to_root_domain': 55255620288,
              'pages_to_subdomain': 1493073570,
              'redirect_pages_to_page': 593282,
              'redirect_pages_to_root_domain': 263224806,
              'redirect_pages_to_subdomain': 5678383,
              'root_domain': 'youtube.com',
              'root_domains_from_page': 0,
              'root_domains_from_root_domain': 257791,
              'root_domains_to_page': 598403,
              'root_domains_to_root_domain': 23134271,
              'root_domains_to_subdomain': 1927717,
              'spam_score': 4,
              'subdomain': 'youtube.com',
              'title': ''}]}

4. Global Top Root Domains (global_top_root_domains)
This endpoint returns the top 500 root domains in the entire index with the highest Domain Authority values, sorted by Domain Authority. (Visit the Top 500 Sites list to explore the top root domains on the web, sorted by Domain Authority.)

{'next_token': 'BcLbRwBmrXHK',
 'results': [{'domain_authority': 100,
              'link_propensity': 0.006248265505,
              'root_domain': 'youtube.com',
              'root_domains_to_root_domain': 23134271,
              'spam_score': 4,
              'to_target': {'deleted_pages': 0,
                            'nofollow_pages': 0,
                            'pages': 0,
                            'redirect_pages': 0}},
             {'domain_authority': 100,
              'link_propensity': 0.008422264829,
              'root_domain': 'www.google.com',
              'root_domains_to_root_domain': 14723695,
              'spam_score': 14,
              'to_target': {'deleted_pages': 0,
                            'nofollow_pages': 0,
                            'pages': 0,
                            'redirect_pages': 0}},
             {'domain_authority': 100,
              'link_propensity': 0.0001607139566,
              'root_domain': 'www.blogger.com',
              'root_domains_to_root_domain': 30580427,
              'spam_score': -1,
              'to_target': {'deleted_pages': 0,
                            'nofollow_pages': 0,
                            'pages': 0,
                            'redirect_pages': 0}},
             {'domain_authority': 99,
              'link_propensity': 0.04834850505,
              'root_domain': 'linkedin.com',
              'root_domains_to_root_domain': 12339087,
              'spam_score': 1,
              'to_target': {'deleted_pages': 0,
                            'nofollow_pages': 0,
                            'pages': 0,
                            'redirect_pages': 0}},
             {'domain_authority': 99,
              'link_propensity': 0.006264935713,
              'root_domain': 'microsoft.com',
              'root_domains_to_root_domain': 5344181,
              'spam_score': 11,
              'to_target': {'deleted_pages': 0,
                            'nofollow_pages': 0,
                            'pages': 0,
                            'redirect_pages': 0}}]}
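
Many responses in this notebook include a next_token. Here is a minimal sketch of paging through results, assuming the token from one response can be sent back with the next request to get the following page (check the official documentation for the exact behavior). The fetch_page callable is hypothetical: it stands in for whatever function posts to the endpoint, and is faked here with canned pages so the sketch runs offline.

```python
def iter_results(fetch_page, max_pages=10):
    """Yield every result row, following next_token until it runs out."""
    token = None
    for _ in range(max_pages):
        response = fetch_page(token)
        yield from response.get("results", [])
        token = response.get("next_token")
        if not token:
            break


# Canned pages standing in for real API responses (values trimmed from above).
FAKE_PAGES = {
    None: {
        "next_token": "BcLbRwBmrXHK",
        "results": [{"root_domain": "youtube.com", "domain_authority": 100}],
    },
    "BcLbRwBmrXHK": {
        "results": [{"root_domain": "linkedin.com", "domain_authority": 99}],
    },
}


def fake_fetch_page(token):
    # Hypothetical stand-in: a real version would POST to the endpoint.
    return FAKE_PAGES[token]


domains = [row["root_domain"] for row in iter_results(fake_fetch_page)]
print(domains)
```

The max_pages cap is there so an experiment cannot quietly consume a large number of rows from your quota.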

5. Index Metadata (index_metadata)
This endpoint returns metadata about the index itself: an index ID and the dates on which Spam Score was updated.

{'index_id': 'NE+lX5bFh06baS9ojUwVbw==',
 'spam_score_update_days': ['2019-02-08',
                            '2020-03-28',
                            '2020-08-03',
                            '2020-11-13',
                            '2021-02-24',
                            '2021-05-19',
                            '2021-08-16',
                            '2021-11-02',
                            '2022-02-01',
                            '2022-05-10',
                            '2022-11-16']}
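
As a small sketch of putting this response to use, the snippet below finds the most recent Spam Score update. It assumes the dates are always ISO-formatted strings, as they are in the output above; the list is abbreviated here.

```python
from datetime import date

# Abbreviated copy of the response shown above.
index_metadata = {
    "index_id": "NE+lX5bFh06baS9ojUwVbw==",
    "spam_score_update_days": ["2019-02-08", "2020-03-28", "2022-05-10", "2022-11-16"],
}


def latest_spam_score_update(metadata):
    """Return the most recent Spam Score update as a datetime.date."""
    days = [date.fromisoformat(day) for day in metadata["spam_score_update_days"]]
    return max(days)


latest = latest_spam_score_update(index_metadata)
print(latest.isoformat())
```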

6. Link Intersect (link_intersect)
Use this endpoint to get sources that link to at least one of a list of positive targets and don't link to any of a list of negative targets.

{'next_token': 'AcmY2oCXQbbg',
 'results': [{'domain_authority': 100,
              'matching_target_indexes': [0],
              'page': 'www.google.com/amp/www.latimes.com/local/lanow/la-me-ln-aliso-viejo-shooting-20171012-story,amp.html',
              'spam_score': 14,
              'title': ''}]}
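
To make the rule concrete, here is a toy in-memory model of the filter. It is not how Moz implements the endpoint, and the source pages and link sets are made up. It also assumes matching_target_indexes holds positions in the positive target list, which is how the output above reads.

```python
def link_intersect(links_by_source, positive_targets, negative_targets):
    """Return {source: indexes of positive targets it links to}.

    A source qualifies if it links to at least one positive target
    and to none of the negative targets.
    """
    negatives = set(negative_targets)
    matches = {}
    for source, targets in links_by_source.items():
        if targets & negatives:
            continue  # links to a negative target, so it is excluded
        hit_indexes = [i for i, t in enumerate(positive_targets) if t in targets]
        if hit_indexes:
            matches[source] = hit_indexes
    return matches


# Made-up data: which targets each source page links to.
links_by_source = {
    "a.example/page": {"latimes.com", "blogger.com"},
    "b.example/page": {"latimes.com", "moz.com"},
    "c.example/page": {"nytimes.com"},
}

result = link_intersect(
    links_by_source,
    positive_targets=["latimes.com", "blogger.com"],
    negative_targets=["moz.com"],
)
print(result)
```

Only the first source survives: the second links to a negative target, and the third links to none of the positive ones.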

7. Link Status (link_status)
Use this endpoint to get information about links from many sources to a single target.

{'exists': [False, False]}
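
The bare list of booleans is easier to read once it is paired with the sources that were sent. This sketch assumes the flags come back in the same order as the sources in the request; the sources listed here are hypothetical, since the request is not shown in this output.

```python
def pair_link_status(sources, response):
    """Map each source to its 'exists' flag, assuming matching order."""
    exists = response["exists"]
    if len(exists) != len(sources):
        raise ValueError("expected one 'exists' flag per source")
    return dict(zip(sources, exists))


sources = ["github.com", "nytimes.com"]  # hypothetical request sources
response = {"exists": [False, False]}    # response shown above

status = pair_link_status(sources, response)
print(status)
```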

8. Linking Root Domains (linking_root_domains)
Use this endpoint to get linking root domains to a target.

{'next_token': 'IokQVg4s9ak8iRBWDiz1qTyguYswnj035qBkmE3DU+JTtwAVhsjH7R6XUA==',
 'results': [{'domain_authority': 99,
              'link_propensity': 0.006264935713,
              'root_domain': 'microsoft.com',
              'root_domains_to_root_domain': 5344181,
              'spam_score': 11,
              'to_target': {'deleted_pages': 0,
                            'nofollow_pages': 0,
                            'pages': 2,
                            'redirect_pages': 0}},
             {'domain_authority': 98,
              'link_propensity': 0.02977741137,
              'root_domain': 'wordpress.org',
              'root_domains_to_root_domain': 12250296,
              'spam_score': 2,
              'to_target': {'deleted_pages': 0,
                            'nofollow_pages': 2,
                            'pages': 2,
                            'redirect_pages': 0}},
             {'domain_authority': 96,
              'link_propensity': 0.09679271281,
              'root_domain': 'github.com',
              'root_domains_to_root_domain': 2948013,
              'spam_score': 2,
              'to_target': {'deleted_pages': 0,
                            'nofollow_pages': 12,
                            'pages': 12,
                            'redirect_pages': 0}},
             {'domain_authority': 96,
              'link_propensity': 0.004641198553,
              'root_domain': 'amazon.com',
              'root_domains_to_root_domain': 5023132,
              'spam_score': 28,
              'to_target': {'deleted_pages': 0,
                            'nofollow_pages': 0,
                            'pages': 2,
                            'redirect_pages': 0}},
             {'domain_authority': 95,
              'link_propensity': 0.005770479795,
              'root_domain': 'shopify.com',
              'root_domains_to_root_domain': 2948087,
              'spam_score': 1,
              'to_target': {'deleted_pages': 3,
                            'nofollow_pages': 0,
                            'pages': 0,
                            'redirect_pages': 0}}]}
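
The to_target counts can be used to separate domains that still link to the target from those whose links have disappeared. This is a rough sketch, assuming pages counts currently linking pages and deleted_pages counts links that were once seen and are now gone; the rows are trimmed copies of the output above.

```python
# Trimmed rows from the response above.
results = [
    {"root_domain": "microsoft.com",
     "to_target": {"deleted_pages": 0, "nofollow_pages": 0, "pages": 2, "redirect_pages": 0}},
    {"root_domain": "wordpress.org",
     "to_target": {"deleted_pages": 0, "nofollow_pages": 2, "pages": 2, "redirect_pages": 0}},
    {"root_domain": "shopify.com",
     "to_target": {"deleted_pages": 3, "nofollow_pages": 0, "pages": 0, "redirect_pages": 0}},
]


def split_live_and_lost(rows):
    """Return (live, lost) lists of root domains based on to_target counts."""
    live, lost = [], []
    for row in rows:
        counts = row["to_target"]
        if counts["pages"] > 0:
            live.append(row["root_domain"])
        elif counts["deleted_pages"] > 0:
            lost.append(row["root_domain"])
    return live, lost


live, lost = split_live_and_lost(results)
print("live:", live)
print("lost:", lost)
```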

9. Links (links)
Use this endpoint to get links to a target.

{'next_token': 'AVvpJ4gPPvOY',
 'results': [{'anchor_text': 'moz blog',
              'date_disappeared': '',
              'date_first_seen': '2020-06-29',
              'date_last_seen': '2023-01-14',
              'nofollow': True,
              'redirect': False,
              'rel_canonical': False,
              'source': {'deleted_pages_to_page': 570,
                         'deleted_pages_to_root_domain': 1251501128,
                         'deleted_pages_to_subdomain': 1182759912,
                         'deleted_root_domains_to_page': 34,
                         'deleted_root_domains_to_root_domain': 322790,
                         'deleted_root_domains_to_subdomain': 314554,
                         'domain_authority': 96,
                         'external_indirect_pages_to_root_domain': 863103308,
                         'external_nofollow_pages_to_page': 1407,
                         'external_nofollow_pages_to_root_domain': 667480081,
                         'external_nofollow_pages_to_subdomain': 650421076,
                         'external_pages_to_page': 3710,
                         'external_pages_to_root_domain': 5309615021,
                         'external_pages_to_subdomain': 5086141938,
                         'external_redirect_pages_to_page': 14,
                         'external_redirect_pages_to_root_domain': 143685025,
                         'external_redirect_pages_to_subdomain': 142061138,
                         'http_code': 200,
                         'indirect_root_domains_to_page': 2,
                         'indirect_root_domains_to_root_domain': 180014,
                         'last_crawled': '2023-01-14',
                         'link_propensity': 0.09679271281,
                         'nofollow_pages_from_page': 199,
                         'nofollow_pages_from_root_domain': 7541042,
                         'nofollow_pages_to_page': 1407,
                         'nofollow_pages_to_root_domain': 678014273,
                         'nofollow_pages_to_subdomain': 660443683,
                         'nofollow_root_domains_from_page': 93,
                         'nofollow_root_domains_from_root_domain': 564314,
                         'nofollow_root_domains_to_page': 58,
                         'nofollow_root_domains_to_root_domain': 186407,
                         'nofollow_root_domains_to_subdomain': 171632,
                         'page': 'github.com/mezod/awesome-indie',
                         'page_authority': 68,
                         'pages_crawled_from_root_domain': 7254823,
                         'pages_from_page': 202,
                         'pages_from_root_domain': 8613796,
                         'pages_to_page': 3746,
                         'pages_to_root_domain': 5628821927,
                         'pages_to_subdomain': 5352019489,
                         'redirect_pages_to_page': 14,
                         'redirect_pages_to_root_domain': 145613441,
                         'redirect_pages_to_subdomain': 142856036,
                         'root_domain': 'github.com',
                         'root_domains_from_page': 96,
                         'root_domains_from_root_domain': 702214,
                         'root_domains_to_page': 231,
                         'root_domains_to_root_domain': 2948013,
                         'root_domains_to_subdomain': 2857538,
                         'spam_score': 2,
                         'subdomain': 'github.com',
                         'title': 'GitHub - mezod/awesome-indie: Resources for '
                                  'independent developers to make money'},
              'target': {'deleted_pages_to_page': 169073,
                         'deleted_pages_to_root_domain': 19022927,
                         'deleted_pages_to_subdomain': 18554702,
                         'deleted_root_domains_to_page': 1457,
                         'deleted_root_domains_to_root_domain': 27522,
                         'deleted_root_domains_to_subdomain': 27273,
                         'domain_authority': 91,
                         'external_indirect_pages_to_root_domain': 45290099,
                         'external_nofollow_pages_to_page': 7388,
                         'external_nofollow_pages_to_root_domain': 17425478,
                         'external_nofollow_pages_to_subdomain': 17269297,
                         'external_pages_to_page': 553261,
                         'external_pages_to_root_domain': 69376449,
                         'external_pages_to_subdomain': 68746190,
                         'external_redirect_pages_to_page': 265,
                         'external_redirect_pages_to_root_domain': 41112725,
                         'external_redirect_pages_to_subdomain': 41109338,
                         'http_code': 200,
                         'indirect_root_domains_to_page': 2219,
                         'indirect_root_domains_to_root_domain': 28779,
                         'last_crawled': '2023-04-02',
                         'link_propensity': 0.008849279955,
                         'nofollow_pages_from_page': 0,
                         'nofollow_pages_from_root_domain': 209067,
                         'nofollow_pages_to_page': 7388,
                         'nofollow_pages_to_root_domain': 17442464,
                         'nofollow_pages_to_subdomain': 17285191,
                         'nofollow_root_domains_from_page': 0,
                         'nofollow_root_domains_from_root_domain': 55943,
                         'nofollow_root_domains_to_page': 1727,
                         'nofollow_root_domains_to_root_domain': 37789,
                         'nofollow_root_domains_to_subdomain': 37690,
                         'page': 'moz.com/blog',
                         'page_authority': 69,
                         'pages_crawled_from_root_domain': 7872618,
                         'pages_from_page': 7,
                         'pages_from_root_domain': 343751,
                         'pages_to_page': 906052,
                         'pages_to_root_domain': 98442581,
                         'pages_to_subdomain': 97352802,
                         'redirect_pages_to_page': 746,
                         'redirect_pages_to_root_domain': 47575576,
                         'redirect_pages_to_subdomain': 47570092,
                         'root_domain': 'moz.com',
                         'root_domains_from_page': 5,
                         'root_domains_from_root_domain': 69667,
                         'root_domains_to_page': 9712,
                         'root_domains_to_root_domain': 179884,
                         'root_domains_to_subdomain': 178649,
                         'spam_score': 1,
                         'subdomain': 'moz.com',
                         'title': 'The Moz Blog [SEO] - Moz'},
              'via_redirect': False,
              'via_rel_canonical': False}]}
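
Each link record nests a full set of metrics under source and target, which is a lot to scan. Here is a short sketch that boils one record down to a single line, using only keys that appear in the output above (the record is trimmed).

```python
# Trimmed copy of the link record shown above.
link = {
    "anchor_text": "moz blog",
    "date_first_seen": "2020-06-29",
    "date_last_seen": "2023-01-14",
    "nofollow": True,
    "source": {"page": "github.com/mezod/awesome-indie", "page_authority": 68},
    "target": {"page": "moz.com/blog", "page_authority": 69},
}


def summarize_link(record):
    """Return 'source -> target [follow|nofollow] anchor="..."'."""
    rel = "nofollow" if record["nofollow"] else "follow"
    return (
        f'{record["source"]["page"]} -> {record["target"]["page"]} '
        f'[{rel}] anchor="{record["anchor_text"]}"'
    )


summary = summarize_link(link)
print(summary)
```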

10. Top Pages (top_pages)
This endpoint returns top pages on a target domain.

{'next_token': 'BXULGXd3IggK',
 'results': [{'deleted_pages_to_page': 1963527,
              'deleted_pages_to_root_domain': 19022927,
              'deleted_pages_to_subdomain': 18554702,
              'deleted_root_domains_to_page': 6527,
              'deleted_root_domains_to_root_domain': 27522,
              'deleted_root_domains_to_subdomain': 27273,
              'domain_authority': 91,
              'external_indirect_pages_to_root_domain': 45290099,
              'external_nofollow_pages_to_page': 9684724,
              'external_nofollow_pages_to_root_domain': 17425478,
              'external_nofollow_pages_to_subdomain': 17269297,
              'external_pages_to_page': 14981546,
              'external_pages_to_root_domain': 69376449,
              'external_pages_to_subdomain': 68746190,
              'external_redirect_pages_to_page': 3632556,
              'external_redirect_pages_to_root_domain': 41112725,
              'external_redirect_pages_to_subdomain': 41109338,
              'http_code': 200,
              'indirect_root_domains_to_page': 10580,
              'indirect_root_domains_to_root_domain': 28779,
              'last_crawled': '2023-04-01',
              'link_propensity': 0.008849279955,
              'nofollow_pages_from_page': 0,
              'nofollow_pages_from_root_domain': 209067,
              'nofollow_pages_to_page': 9684724,
              'nofollow_pages_to_root_domain': 17442464,
              'nofollow_pages_to_subdomain': 17285191,
              'nofollow_root_domains_from_page': 0,
              'nofollow_root_domains_from_root_domain': 55943,
              'nofollow_root_domains_to_page': 8749,
              'nofollow_root_domains_to_root_domain': 37789,
              'nofollow_root_domains_to_subdomain': 37690,
              'page': 'moz.com/',
              'page_authority': 74,
              'pages_crawled_from_root_domain': 7872618,
              'pages_from_page': 7,
              'pages_from_root_domain': 343751,
              'pages_to_page': 15343034,
              'pages_to_root_domain': 98442581,
              'pages_to_subdomain': 97352802,
              'redirect_pages_to_page': 3633007,
              'redirect_pages_to_root_domain': 47575576,
              'redirect_pages_to_subdomain': 47570092,
              'root_domain': 'moz.com',
              'root_domains_from_page': 5,
              'root_domains_from_root_domain': 69667,
              'root_domains_to_page': 41190,
              'root_domains_to_root_domain': 179884,
              'root_domains_to_subdomain': 178649,
              'spam_score': 1,
              'subdomain': 'moz.com',
              'title': 'Moz - SEO Software for Smarter Marketing'},
             {'deleted_pages_to_page': 185579,
              'deleted_pages_to_root_domain': 19022927,
              'deleted_pages_to_subdomain': 18554702,
              'deleted_root_domains_to_page': 2440,
              'deleted_root_domains_to_root_domain': 27522,
              'deleted_root_domains_to_subdomain': 27273,
              'domain_authority': 91,
              'external_indirect_pages_to_root_domain': 45290099,
              'external_nofollow_pages_to_page': 11211,
              'external_nofollow_pages_to_root_domain': 17425478,
              'external_nofollow_pages_to_subdomain': 17269297,
              'external_pages_to_page': 424268,
              'external_pages_to_root_domain': 69376449,
              'external_pages_to_subdomain': 68746190,
              'external_redirect_pages_to_page': 348,
              'external_redirect_pages_to_root_domain': 41112725,
              'external_redirect_pages_to_subdomain': 41109338,
              'http_code': 200,
              'indirect_root_domains_to_page': 1389,
              'indirect_root_domains_to_root_domain': 28779,
              'last_crawled': '2023-04-03',
              'link_propensity': 0.008849279955,
              'nofollow_pages_from_page': 0,
              'nofollow_pages_from_root_domain': 209067,
              'nofollow_pages_to_page': 11211,
              'nofollow_pages_to_root_domain': 17442464,
              'nofollow_pages_to_subdomain': 17285191,
              'nofollow_root_domains_from_page': 0,
              'nofollow_root_domains_from_root_domain': 55943,
              'nofollow_root_domains_to_page': 2487,
              'nofollow_root_domains_to_root_domain': 37789,
              'nofollow_root_domains_to_subdomain': 37690,
              'page': 'moz.com/beginners-guide-to-seo',
              'page_authority': 72,
              'pages_crawled_from_root_domain': 7872618,
              'pages_from_page': 7,
              'pages_from_root_domain': 343751,
              'pages_to_page': 786960,
              'pages_to_root_domain': 98442581,
              'pages_to_subdomain': 97352802,
              'redirect_pages_to_page': 365,
              'redirect_pages_to_root_domain': 47575576,
              'redirect_pages_to_subdomain': 47570092,
              'root_domain': 'moz.com',
              'root_domains_from_page': 5,
              'root_domains_from_root_domain': 69667,
              'root_domains_to_page': 15276,
              'root_domains_to_root_domain': 179884,
              'root_domains_to_subdomain': 178649,
              'spam_score': 1,
              'subdomain': 'moz.com',
              'title': "Beginner\'s Guide to SEO [plus FREE quick start "
                       'checklist] - Moz'},
             {'deleted_pages_to_page': 7159,
              'deleted_pages_to_root_domain': 19022927,
              'deleted_pages_to_subdomain': 18554702,
              'deleted_root_domains_to_page': 1382,
              'deleted_root_domains_to_root_domain': 27522,
              'deleted_root_domains_to_subdomain': 27273,
              'domain_authority': 91,
              'external_indirect_pages_to_root_domain': 45290099,
              'external_nofollow_pages_to_page': 8605,
              'external_nofollow_pages_to_root_domain': 17425478,
              'external_nofollow_pages_to_subdomain': 17269297,
              'external_pages_to_page': 34152,
              'external_pages_to_root_domain': 69376449,
              'external_pages_to_subdomain': 68746190,
              'external_redirect_pages_to_page': 70,
              'external_redirect_pages_to_root_domain': 41112725,
              'external_redirect_pages_to_subdomain': 41109338,
              'http_code': 200,
              'indirect_root_domains_to_page': 782,
              'indirect_root_domains_to_root_domain': 28779,
              'last_crawled': '2023-04-03',
              'link_propensity': 0.008849279955,
              'nofollow_pages_from_page': 0,
              'nofollow_pages_from_root_domain': 209067,
              'nofollow_pages_to_page': 8754,
              'nofollow_pages_to_root_domain': 17442464,
              'nofollow_pages_to_subdomain': 17285191,
              'nofollow_root_domains_from_page': 0,
              'nofollow_root_domains_from_root_domain': 55943,
              'nofollow_root_domains_to_page': 1380,
              'nofollow_root_domains_to_root_domain': 37789,
              'nofollow_root_domains_to_subdomain': 37690,
              'page': 'moz.com/google-algorithm-change',
              'page_authority': 70,
              'pages_crawled_from_root_domain': 7872618,
              'pages_from_page': 420,
              'pages_from_root_domain': 343751,
              'pages_to_page': 35181,
              'pages_to_root_domain': 98442581,
              'pages_to_subdomain': 97352802,
              'redirect_pages_to_page': 73,
              'redirect_pages_to_root_domain': 47575576,
              'redirect_pages_to_subdomain': 47570092,
              'root_domain': 'moz.com',
              'root_domains_from_page': 60,
              'root_domains_from_root_domain': 69667,
              'root_domains_to_page': 8881,
              'root_domains_to_root_domain': 179884,
              'root_domains_to_subdomain': 178649,
              'spam_score': 1,
              'subdomain': 'moz.com',
              'title': 'Moz - Google Algorithm Update History'},
             {'deleted_pages_to_page': 33133,
              'deleted_pages_to_root_domain': 19022927,
              'deleted_pages_to_subdomain': 18554702,
              'deleted_root_domains_to_page': 1192,
              'deleted_root_domains_to_root_domain': 27522,
              'deleted_root_domains_to_subdomain': 27273,
              'domain_authority': 91,
              'external_indirect_pages_to_root_domain': 45290099,
              'external_nofollow_pages_to_page': 31500,
              'external_nofollow_pages_to_root_domain': 17425478,
              'external_nofollow_pages_to_subdomain': 17269297,
              'external_pages_to_page': 70673,
              'external_pages_to_root_domain': 69376449,
              'external_pages_to_subdomain': 68746190,
              'external_redirect_pages_to_page': 77,
              'external_redirect_pages_to_root_domain': 41112725,
              'external_redirect_pages_to_subdomain': 41109338,
              'http_code': 301,
              'indirect_root_domains_to_page': 315,
              'indirect_root_domains_to_root_domain': 28779,
              'last_crawled': '2023-04-02',
              'link_propensity': 0.008849279955,
              'nofollow_pages_from_page': 0,
              'nofollow_pages_from_root_domain': 209067,
              'nofollow_pages_to_page': 31628,
              'nofollow_pages_to_root_domain': 17442464,
              'nofollow_pages_to_subdomain': 17285191,
              'nofollow_root_domains_from_page': 0,
              'nofollow_root_domains_from_root_domain': 55943,
              'nofollow_root_domains_to_page': 1689,
              'nofollow_root_domains_to_root_domain': 37789,
              'nofollow_root_domains_to_subdomain': 37690,
              'page': 'moz.com/researchtools/ose/',
              'page_authority': 70,
              'pages_crawled_from_root_domain': 7872618,
              'pages_from_page': 0,
              'pages_from_root_domain': 343751,
              'pages_to_page': 344305,
              'pages_to_root_domain': 98442581,
              'pages_to_subdomain': 97352802,
              'redirect_pages_to_page': 78,
              'redirect_pages_to_root_domain': 47575576,
              'redirect_pages_to_subdomain': 47570092,
              'root_domain': 'moz.com',
              'root_domains_from_page': 0,
              'root_domains_from_root_domain': 69667,
              'root_domains_to_page': 8086,
              'root_domains_to_root_domain': 179884,
              'root_domains_to_subdomain': 178649,
              'spam_score': 1,
              'subdomain': 'moz.com',
              'title': ''},
             {'deleted_pages_to_page': 169073,
              'deleted_pages_to_root_domain': 19022927,
              'deleted_pages_to_subdomain': 18554702,
              'deleted_root_domains_to_page': 1457,
              'deleted_root_domains_to_root_domain': 27522,
              'deleted_root_domains_to_subdomain': 27273,
              'domain_authority': 91,
              'external_indirect_pages_to_root_domain': 45290099,
              'external_nofollow_pages_to_page': 7388,
              'external_nofollow_pages_to_root_domain': 17425478,
              'external_nofollow_pages_to_subdomain': 17269297,
              'external_pages_to_page': 553261,
              'external_pages_to_root_domain': 69376449,
              'external_pages_to_subdomain': 68746190,
              'external_redirect_pages_to_page': 265,
              'external_redirect_pages_to_root_domain': 41112725,
              'external_redirect_pages_to_subdomain': 41109338,
              'http_code': 200,
              'indirect_root_domains_to_page': 2219,
              'indirect_root_domains_to_root_domain': 28779,
              'last_crawled': '2023-04-02',
              'link_propensity': 0.008849279955,
              'nofollow_pages_from_page': 0,
              'nofollow_pages_from_root_domain': 209067,
              'nofollow_pages_to_page': 7388,
              'nofollow_pages_to_root_domain': 17442464,
              'nofollow_pages_to_subdomain': 17285191,
              'nofollow_root_domains_from_page': 0,
              'nofollow_root_domains_from_root_domain': 55943,
              'nofollow_root_domains_to_page': 1727,
              'nofollow_root_domains_to_root_domain': 37789,
              'nofollow_root_domains_to_subdomain': 37690,
              'page': 'moz.com/blog',
              'page_authority': 69,
              'pages_crawled_from_root_domain': 7872618,
              'pages_from_page': 7,
              'pages_from_root_domain': 343751,
              'pages_to_page': 906052,
              'pages_to_root_domain': 98442581,
              'pages_to_subdomain': 97352802,
              'redirect_pages_to_page': 746,
              'redirect_pages_to_root_domain': 47575576,
              'redirect_pages_to_subdomain': 47570092,
              'root_domain': 'moz.com',
              'root_domains_from_page': 5,
              'root_domains_from_root_domain': 69667,
              'root_domains_to_page': 9712,
              'root_domains_to_root_domain': 179884,
              'root_domains_to_subdomain': 178649,
              'spam_score': 1,
              'subdomain': 'moz.com',
              'title': 'The Moz Blog [SEO] - Moz'}]}

11. URL Metrics (url_metrics)
Use this endpoint to get metrics about one or more URLs.

{'results': [{'deleted_pages_to_page': 1963527,
              'deleted_pages_to_root_domain': 19022927,
              'deleted_pages_to_subdomain': 18554702,
              'deleted_root_domains_to_page': 6527,
              'deleted_root_domains_to_root_domain': 27522,
              'deleted_root_domains_to_subdomain': 27273,
              'domain_authority': 91,
              'external_indirect_pages_to_root_domain': 45290099,
              'external_nofollow_pages_to_page': 9684724,
              'external_nofollow_pages_to_root_domain': 17425478,
              'external_nofollow_pages_to_subdomain': 17269297,
              'external_pages_to_page': 14981546,
              'external_pages_to_root_domain': 69376449,
              'external_pages_to_subdomain': 68746190,
              'external_redirect_pages_to_page': 3632556,
              'external_redirect_pages_to_root_domain': 41112725,
              'external_redirect_pages_to_subdomain': 41109338,
              'http_code': 200,
              'indirect_root_domains_to_page': 10580,
              'indirect_root_domains_to_root_domain': 28779,
              'last_crawled': '2023-04-01',
              'link_propensity': 0.008849279955,
              'nofollow_pages_from_page': 0,
              'nofollow_pages_from_root_domain': 209067,
              'nofollow_pages_to_page': 9684724,
              'nofollow_pages_to_root_domain': 17442464,
              'nofollow_pages_to_subdomain': 17285191,
              'nofollow_root_domains_from_page': 0,
              'nofollow_root_domains_from_root_domain': 55943,
              'nofollow_root_domains_to_page': 8749,
              'nofollow_root_domains_to_root_domain': 37789,
              'nofollow_root_domains_to_subdomain': 37690,
              'page': 'moz.com/',
              'page_authority': 74,
              'pages_crawled_from_root_domain': 7872618,
              'pages_from_page': 7,
              'pages_from_root_domain': 343751,
              'pages_to_page': 15343034,
              'pages_to_root_domain': 98442581,
              'pages_to_subdomain': 97352802,
              'redirect_pages_to_page': 3633007,
              'redirect_pages_to_root_domain': 47575576,
              'redirect_pages_to_subdomain': 47570092,
              'root_domain': 'moz.com',
              'root_domains_from_page': 5,
              'root_domains_from_root_domain': 69667,
              'root_domains_to_page': 41190,
              'root_domains_to_root_domain': 179884,
              'root_domains_to_subdomain': 178649,
              'spam_score': 1,
              'subdomain': 'moz.com',
              'title': 'Moz - SEO Software for Smarter Marketing'},
             {'deleted_pages_to_page': 249094,
              'deleted_pages_to_root_domain': 224212706,
              'deleted_pages_to_subdomain': 898844,
              'deleted_root_domains_to_page': 3696,
              'deleted_root_domains_to_root_domain': 177001,
              'deleted_root_domains_to_subdomain': 9251,
              'domain_authority': 95,
              'external_indirect_pages_to_root_domain': 156562794,
              'external_nofollow_pages_to_page': 163849,
              'external_nofollow_pages_to_root_domain': 72093550,
              'external_nofollow_pages_to_subdomain': 294697,
              'external_pages_to_page': 1165187,
              'external_pages_to_root_domain': 514661963,
              'external_pages_to_subdomain': 2310818,
              'external_redirect_pages_to_page': 3049,
              'external_redirect_pages_to_root_domain': 4827448,
              'external_redirect_pages_to_subdomain': 8140,
              'http_code': 301,
              'indirect_root_domains_to_page': 1439,
              'indirect_root_domains_to_root_domain': 30315,
              'last_crawled': '2023-03-31',
              'link_propensity': 0.02704063244,
              'nofollow_pages_from_page': 0,
              'nofollow_pages_from_root_domain': 97163,
              'nofollow_pages_to_page': 163881,
              'nofollow_pages_to_root_domain': 72644206,
              'nofollow_pages_to_subdomain': 294765,
              'nofollow_root_domains_from_page': 0,
              'nofollow_root_domains_from_root_domain': 22711,
              'nofollow_root_domains_to_page': 5647,
              'nofollow_root_domains_to_root_domain': 178651,
              'nofollow_root_domains_to_subdomain': 11590,
              'page': 'nytimes.com/',
              'page_authority': 82,
              'pages_crawled_from_root_domain': 13567138,
              'pages_from_page': 0,
              'pages_from_root_domain': 3152122,
              'pages_to_page': 1170498,
              'pages_to_root_domain': 763781494,
              'pages_to_subdomain': 2489707,
              'redirect_pages_to_page': 3053,
              'redirect_pages_to_root_domain': 9268395,
              'redirect_pages_to_subdomain': 14273,
              'root_domain': 'nytimes.com',
              'root_domains_from_page': 0,
              'root_domains_from_root_domain': 366864,
              'root_domains_to_page': 25307,
              'root_domains_to_root_domain': 2200598,
              'root_domains_to_subdomain': 62699,
              'spam_score': 1,
              'subdomain': 'nytimes.com',
              'title': ''}]}

12. Usage Data (usage_data)
This endpoint returns the number of rows consumed so far in the current billing period. The count might not reflect rows consumed in the last hour, and it covers requests to both the v1 (Moz Links API) and v2 Links APIs.

{'rows_consumed': 254}

Friday, June 2, 2023

Easy to Implement Tactics for Local Link Building — Whiteboard Friday

Google’s local algorithm demands different tactics for link building. And even if you don't need local SEO, local link building strategies can still give you a different perspective and improve your work. In today’s episode, Greg walks you through some of these strategies.

Video Transcription

♪ [music] ♪ Howdy, Moz fans. I'm Greg Gifford, the COO of SearchLab Digital, and I'm back to do another Whiteboard Friday. Today we're talking tactics for local links. Get it, huh? The important thing to remember is that Google's local algorithm is different, so you need to approach local link building differently.

Even if you don't need local SEO, local link-building tactics can still give you a different perspective, and we all know that with a different perspective, you can do better work. Whoa. So the most important thing to understand is that the best links are based on relationships that you have in the real world. They're easier to get and they're more powerful.

The other thing to remember is the easiest way to get local links is to turn the clock back and do the things that businesses used to do to get exposure in the local community that everybody kind of stopped doing once we had the internet. So if you just get involved in the local community, you'll naturally acquire a lot of amazing local links. Remember checklist link building rarely works.

That's not what I wrote, but I wasn't watching. But yeah, checklist link building doesn't really work. You have to be original. You need unique links to win so that you've got links that your competitors don't have. So don't follow a checklist. Think outside the box. Be creative.

So a couple of ideas and tactics that I want to run you through. Sponsorships are great. Now, also I should remind you Google is basically pattern detection. So most importantly, remember that when you look at your link profile, you can't just do one or two things. You have to have a natural mix. But with sponsorships, a lot of people avoid them because they think that you're really buying links because obviously buying links is bad.

But Google is totally okay with you buying a sponsorship that results in a link. So Little League, peewee hockey, peewee football, 5Ks, golf tournaments, these are all great things that a lot of businesses are already doing. So find those easy, cheap, awesome local sponsorship opportunities. Knock those out.

Charities are great too. So you're not sponsoring an event, you're giving money or donating time to a local charity. That's an awesome link opportunity as well. Volunteer opportunities are great as well. You know, you're taking your team down to feed the homeless at the soup kitchen or anything like that, or highway clean-ups. Things like that are awesome local link opportunities. Local meetups are one of my favorite things to do.

So you want to go to a site like meetup.com, and there's a couple of different tactics you can use here. First of all, if you've got some sort of a meeting room or a board room, conference room kind of situation and you're not using it all the time, go on meetup.com, look for local groups that have meetings on a regular basis. Let's say the group has a meeting on the second Monday of every month at 7:00 p.m. but your office is closed at 6:00.

So you could offer them your conference room or your meeting area. They've got free Wi-Fi, they've got an awesome big TV they can connect to and etc., and cool, boom, you get a local link out of it. Now, if you don't have a space that you can offer to those local groups, again get on meetup.com or Facebook groups or whatever and look for local groups that are looking for meeting sponsors. Forty, fifty, sixty bucks a month buys their soft drinks and their snacks, and killer local links come right your way.

Blogs are great too. Find the local bloggers and get them to write about you. Now, obviously, the blog is going to say, "Hey, I was given a free widget to write about Greg's widget company," but who cares? You still get an awesome link from a local blogger, and even if it's something that seems like it may not be related to what you do, it's still okay because that blogger is in the local area.

So even if it's a food blogger or a travel blogger and you give them a free widget to talk about your widget company, doesn't matter. You still get an awesome local link. Local business associations, it's a no-brainer. Better Business Bureau is an amazing one. In fact, Google has just added the Better Business Bureau link into your verification process if you're trying to get reinstated in your Google Business profile.

But that's a whole nother Whiteboard Friday that we'll cover later. But join all of those local business associations. They're a no-brainer. Local business directories, couldn't fit that in there, but local business directories are great as well. Join all of them that you can. Even if it's a little bit of a fee to join that business directory, it's still a really killer list of really relevant local links. Local calendar pages are another thing that a lot of businesses don't really think about.

There are always different websites or organizations in the local area, and a lot of times it's the city government site or local newspaper or TV that has a calendar page of just community events and you can't get on there if you don't have something event worthy. But if you've got a sales event or a barbecue or some sort of a cookout like that, you get a local link back to your site.

You also want to think outside the box. Don't do the same thing that everybody else does. You don't do the same SEO for every client, so you shouldn't do the same link building for every client. So a couple of outside-the-box things. If you work with car dealerships or pediatricians or even personal injury attorneys, child seat installation is a killer tactic.

You can have the staff of that location go to a two-day safety course. You can sign up at cert.safekids.org. I didn't have room to get it all on there. But you sit through the safety class and then you are officially certified to be a place that anyone in the community can come for free to get their child seat installed correctly.

Because guess what, 99% of us are installing our child seats incorrectly. So if there was an accident, our kids would get injured. So this really makes sense if it's something related to kids or if it's something like personal injury or doctors, really killer. If the business owner or leadership of the business is a member of a certain ethnic group, there are ethnic business directories in every city in the country.

So let's say you have an Iranian dry cleaner. I'm pulling that out of the air. There will be a list of Iranian-owned businesses in that area. You can get on that list, and the competitors can't. It's killer. Or if someone is LGBTQIA+, that's a link you can get that again competitors can't get, and there are LGBTQIA+ directories in every city that you can get on.

So go look for those if it's applicable to your business. This last one is a really kind of wacky one that people don't really understand the first time I mention it. This isn't about the clubs and organizations that your business is a member of. You want to talk to the staff at your client's business or at your business if you're in-house, especially leadership of the organization, and find out what they're passionate about.

What do they do in their free time that they really enjoy doing? What are their hobbies? Because if they're a member of a local club or organization, especially if they're so involved in that club or organization that they're on the leadership board of that organization, guess what, it's going to be super easy to get your business linked from that business' website.

This one here, a lot of people are going to laugh at this because it's an old-school technique, but it really works. If you periodically, I'm not saying every week, but periodically you write a local informational blog post of like, "Hey, our staff loves to go out and grab barbecue every Friday and here are the five best barbecue places in Dallas, according to our staff," once you've got that blog post up, you can then do outreach to each of those five locations and say, "Hey, we listed you as one of the five best places in Dallas."

Even if it's not related to the business, this is the kind of stuff that shows up in search results. So you get surfaced and you get eyeballs on your business. So it's great for branding. It's kind of that billboard effect, and it gets killer links. The really awesome part that people don't really think about is most of the time, when you're getting these links, you're dealing with people that aren't that technically savvy.

So yeah, sure, the people that are technically savvy are going to link to that specific local blog post. But the people that aren't, they're going to link to your homepage. So it's really killer. So a cool story because everybody that knows me knows I'm a story guy. Several years ago, I was speaking at a conference in Vegas and a lawyer came up and he said, "Hey, man, I know a lot about SEO and I need your help."

I said, "Okay, let me help you." He said, "I've got three times as many links as this other attorney in town, yet he outranks me on every single query possible." I said, "Okay." The guy again said, "I know SEO, so I should be winning." Well, obviously there's a lot more than just links at play. So what I did, I built a little spreadsheet and I graphed out the link profile of this guy versus his competitor based on Domain Authority.

Stick with me, though. It doesn't matter how many links they had, it's just based on the authority. So we see here the guy in red is the guy that I was talking to. So the guy in red has three times as many links as the guy in blue. So when I graph it out, this one right here is Domain Authority of above 50. This one right here is Domain Authority of 11 to 50, and this one right here is Domain Authority below 10.

You can see 67% of the guy in blue links are above a 50, and 53% or almost 60% of the guy in red who was the guy asking me for help was below 50. I said, "All right, there's a story here. Let's break it down a little bit more." So this is 91 to 100. This is 71 to 90, 51 to 70, 26 to 50, and 0 to 25.

You can see right here that almost a massive section, like 65% of the guy in blue links are above Domain Authority of 7 versus 60% basically of the guy in red that is really low, below a DA of 50. Now, the guy in red wasn't doing local SEO.

Even though the guy in red had three times as many links, they all skewed towards the bottom of the authority range because he was getting really horrible links from like seodirectory.com and linksxyz.com, and stupid things like that that wouldn't matter. The important thing to understand is if the guy in red had been doing local SEO and getting local links, this graph would look the same way because local links skew towards lower authority.

But the guy in red would have been destroying the guy in blue if he'd been getting local links. So that's the way that you need to change your perspective and think about it differently. If you'd like to do something similar yourself, you can do exactly the same thing with my Badass Link Worksheet. You can download it right here at bit.ly/gregs-bad-ass-link-sheet. That is all lowercase when you type it in.

So make sure you do that, or I'm sure they'll drop it down in the blog post or in the comments so that you can click on it there. You can download that sheet. It's an Excel spreadsheet. It does have macros enabled to basically make it easy to clear information out. It's set to work with your Moz link export. You need last month's link export, this month's link export, and your competitor's link export for this month.

You drop those in, automatically it's going to graph out all of this stuff. On another tab, it's going to give you a list of all of the link gap opportunities where your competitors have links and you don't. Then on another tab, it's going to give you a list of all the links that you've lost since the last time you did it. Now, yeah, there's some link tools out there that do it too, but this is a really easy tool.

It's super killer, and I'm sharing it with you guys today for free. Thank you so much for watching my newest episode of Whiteboard Friday.

Video transcription by Speechpad.com

Wednesday, May 31, 2023

How Social Media Can Supercharge Your SEO

When working in social media, it can feel like you exist worlds away from SEO. And as an SEO, social media may feel like something that isn’t quite relevant in your day to day. But as with all things marketing, both of these digital marketing tactics have the potential to boost collective success. As a Social Media Manager, I’m here to tell you how you as an SEO can collaborate with your social media team in order to help supercharge your SEO efforts.

What is a social media strategy?

A social media strategy is a document that outlines your organization’s social media goals, along with how you will achieve them, both through top-level strategy and on-the-ground tactics (i.e., what you actually do). A strategy is the foundation of how your organization approaches being on social media.

Social media vs. search engine optimization

Social media involves owning accounts and having an active presence on social media channels like Twitter, Instagram, Facebook, LinkedIn, TikTok, and YouTube, with the goal of driving brand awareness and engagement, or increasing traffic and conversions. On the other hand, search engine optimization (SEO) is a set of practices designed to improve the appearance and positioning of web pages in organic search results, resulting in increased website traffic and exposure to your brand.

Do links from social media improve your SEO?

Links from popular social media platforms such as Facebook are "no-follow" links, meaning they do not send link authority directly to your site. In other words, they do not feed PageRank, Google's algorithm that ranks web pages based on the quantity and quality of external backlinks. However, gaining no-follow links from high-quality domains is still extremely important.

In the past, marketers ignored “no-follow” links, as they did not have any impact on organic ranking, but the “no-follow” attribute isn’t completely useless. A well-balanced backlink profile consisting of both followed and no-followed links will appear more natural to Google and other search engines.

Another benefit of "no-follow" links is the referral traffic that they can provide. Although search engines will not follow links with the attached HTML "no-follow" attribute, users can click them to reach your site, giving you more traffic!

While no-follow links do not provide the same boost to your site's backlink profile as followed links, Google still likes to see them as a part of your site's backlink profile, and they offer a valuable source of referral traffic.

The SEO benefits of increased brand awareness

The primary SEO benefit of brand awareness that your social media strategy can drive is the boost you can see in "branded" organic search volume and clicks.

Not every user encountering your brand on their Instagram or TikTok feed will click through to your site — in fact, most won’t. Most people will mentally file away your brand name and products only to perform a Google search for your company name or products after the fact, i.e. a branded search. This is especially true if your social messaging is solid and memorable.

For many sites, especially newer ones, a branded search can represent a large portion of your organic traffic.

5 ways social media can improve your SEO

There are five ways that a robust social media presence can help improve your SEO:

Amplify website content through social channels to reach new audiences

Your website content may be great, but you need to drive eyes to it somehow! Sharing your content, like blogs or guides, on social media is a win-win-win:

  • You're building positive brand sentiment by providing content that answers people's questions.

  • You're driving more users to your website.

  • The positive response toward your content on social media sends signals to the social algorithms and therefore often shows it to new people.

One way we do this at Moz is with this very blog! Anything the Moz Blog publishes is promoted on our social media channels, which not only drives traffic but puts valuable content right in front of our audience for them to get immediate insights from.

Create and share infographics in social posts and blog articles

In my experience, people love nothing more on social media than a classic infographic. Sharing information in bite-sized, colorful, and visually appealing ways will result in shares, engagement, and traffic to your website. Plus, they're versatile — include them in your blogs, and you can use them on your social media posts! Every Whiteboard Friday episode that we publish here at Moz gets its own accompanying infographic. This is a great way to resurface a well-loved episode, and give people more value up front.

Build relationships with customers

One of the core tenets of social media is that it's a two-way street. As you get started, you as a brand need to provide valuable content to your audience without asking them for anything in return. Once you've cultivated goodwill with your audience, you now have a relationship in which you provide value, build that favorable currency, and then you're able to cash in on it in exchange for traffic or follow-throughs on your CTAs.

While our social media philosophy is that everything we put on social media has some form of value to our audience, we also make it a point to create content that doesn’t explicitly ask for anything, like clicking links or purchasing our product. Sometimes that’s providing them with information, and sometimes that can look like making them laugh.

Optimize your profiles on social channels and lead audiences toward your website

A simple but effective way to lead audiences to your website is to make it easy to get to! Ensure you optimize your social channels and keep a link to your website in each profile. If you need to house multiple links, use a “link in bio” service, but always make sure a quick shortcut to your website stays front and center.

This strategy is something we use on our Instagram. Instead of constantly changing the link based on what we’re promoting that day or just wasting the opportunity the link in bio provides, we have a link in bio tool through Sprout Social that lets us showcase all the links that are tied to each of our posts.

Target users who are more likely to convert to your site. Conversion and engagement metrics are great for SEO!

With social media, you should always know who you're trying to reach and how you're going to do so. One audience you should target on social media is people you know are ready to convert. Have different posts for different audiences as a part of your content mix, and include more mature leads further down the funnel. These become easy wins because they convert and engage once they hit the website, which is helpful for SEO metrics.

We know that the majority of people are coming to Moz for beginner SEO education, so we make it a point to really highlight those resources, such as our Beginner’s Guide to SEO or our How to Rank Checklist, knowing they will always see a lot of traffic and engagement.

Build relationships between your social media and SEO teams

A strong relationship between your social media and SEO teams is crucial. You can trade information about high-performing topics that can inform strategy on both sides or allow you to make reactive changes to your tactics based on opportunities. Schedule a monthly one-on-one with your respective counterpart in your organization to connect and fill each other in on pertinent information.

With this information, you’re now armed to go out and make this happen for yourself! Take this as an opportunity to connect with your social media team and find new and innovative ways to collaborate and drive results for both social media and SEO.

Monday, May 29, 2023

The Moz Links API: An Introduction

What exactly IS an API? They’re those things that you copy and paste long strange codes into Screaming Frog for links data on a Site Crawl, right?

I’m here to tell you there’s so much more to them than that – if you’re willing to take just a few little steps. But first, some basics.

What’s an API?

API stands for “application programming interface”, and it’s just the way of… using a thing. Everything has an API. The web is a giant API that takes URLs as input and returns pages.

But special data services like the Moz Links API have their own set of rules. These rules vary from service to service and can be a major stumbling block for people taking the next step.

When Screaming Frog gives you the extra links columns in a crawl, it’s using the Moz Links API, but you can have this capability anywhere. For example, all that tedious manual stuff you do in spreadsheet environments can be automated from data-pull to formatting and emailing a report.

If you take this next step, you can be more efficient than your competitors, designing and delivering your own SEO services instead of relying upon, paying for, and being limited by the next proprietary product integration.

GET vs. POST

Most APIs you’ll encounter use the same data transport mechanism as the web. That means there’s a URL involved just like a website. Don’t get scared! It’s easier than you think. In many ways, using an API is just like using a website.

As with loading web pages, the data of the request may be in one of two places: the URL itself, or the body of the request. The URL is called the “endpoint” and the often invisibly submitted extra part of the request is called the “payload” or “data”. When the data is in the URL, it’s called a “query string” and indicates the “GET” method is used. You see this all the time when you search:

https://www.google.com/search?q=moz+links+api <-- GET method 

When the data of the request is hidden, it’s called a “POST” request. You see this when you submit a form on the web and the submitted data does not show on the URL. When you hit the back button after such a POST, browsers usually warn you against double-submits. The reason the POST method is often used is that you can fit a lot more in the request using the POST method than the GET method. URLs would get very long otherwise. The Moz Links API uses the POST method.
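
To make the difference concrete, here is a minimal sketch that uses only Python’s standard library. It builds one GET request and one POST request without sending either of them. The POST address (api.example.com) is a placeholder invented for illustration, not the real Moz endpoint.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# GET: the data rides along in the URL as a query string.
query = urlencode({"q": "moz links api"})
get_request = Request("https://www.google.com/search?" + query)

# POST: the data travels in the request body, not in the URL.
# The URL below is a placeholder, not a real API address.
payload = json.dumps({"target": "moz.com/blog", "scope": "page", "limit": 1})
post_request = Request(
    "https://api.example.com/endpoint",
    data=payload.encode("utf-8"),
    method="POST",
)

# Nothing is sent; we only inspect what would be sent.
print(get_request.get_method(), get_request.full_url)
print(post_request.get_method(), post_request.full_url)
print(post_request.data)
```

Notice that the POST URL stays short no matter how much data goes into the body.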

Making requests

A web browser is what traditionally makes requests of websites for web pages. The browser is a type of software known as a client. Clients are what make requests of services. More than just browsers can make requests. The ability to make client web requests is often built into programming languages like Python, or can be broken out as a standalone tool. The most popular tools for making requests outside a browser are curl and wget.

We are discussing Python here. Python has a built-in library called urllib, but it’s designed to handle so many different types of requests that it’s a bit of a pain to use. There are other libraries that are more specialized for making requests of APIs. The most popular for Python is called requests. It’s so popular that it’s used for almost every Python API tutorial you’ll find on the web. So I will use it too. This is what “hitting” the Moz Links API looks like:

response = requests.post(endpoint, data=json_string, auth=auth_tuple)

Given that everything was set up correctly (more on that soon), this will produce the following output:

{'next_token': 'JYkQVg4s9ak8iRBWDiz1qTyguYswnj035nqrQ1oIbW96IGJsb2dZgGzDeAM7Rw==',
 'results': [{'anchor_text': 'moz',
              'external_pages': 7162,
              'external_root_domains': 2026}]}

This is JSON data. It's contained within the response object that was returned from the API. It’s not on the drive or in a file. It’s in memory. So long as it’s in memory, you can do stuff with it (often just saving it to a file).

If you wanted to grab a piece of data within such a response, you could refer to it like this:

response.json()['results'][0]['external_pages']

This says: “Give me the first item in the results list, and then give me the external_pages value from that item.” The result would be 7162.

NOTE: If you’re actually following along executing code, the above line won’t work alone. There’s a certain amount of setup we’ll do shortly, including installing the requests library and setting up a few variables. But this is the basic idea.
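
The same lookup can be tried without any API access by pasting the output shown earlier into a plain dict. This is only a sketch: the variable names data and first_result are chosen for illustration.

```python
# The output shown earlier, written out as a plain Python dict.
data = {
    "next_token": "JYkQVg4s9ak8iRBWDiz1qTyguYswnj035nqrQ1oIbW96IGJsb2dZgGzDeAM7Rw==",
    "results": [
        {"anchor_text": "moz", "external_pages": 7162, "external_root_domains": 2026}
    ],
}

first_result = data["results"][0]  # first item in the results list
external_pages = first_result["external_pages"]
print(external_pages)

# .get() returns None instead of raising KeyError when a key is missing.
print(first_result.get("no_such_key"))
```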

JSON

JSON stands for JavaScript Object Notation. It’s a way of representing data in a way that’s easy for humans to read and write. It’s also easy for computers to read and write. It’s a very common data format for APIs that has somewhat taken over the world since the older ways were too difficult for most people to use. Some people might call this part of the “restful” API movement, but the much more difficult XML format is also considered “restful” and everyone seems to have their own interpretation. Consequently, I find it best to just focus on JSON and how it gets in and out of Python.

Python dictionaries

I lied to you. I said that the data structure you were looking at above was JSON. Technically it’s really a Python dictionary or dict datatype object. It’s a special kind of object in Python that’s designed to hold key/value pairs. In JSON the keys are always strings (Python itself also allows other hashable types, such as numbers, as keys) and the values can be any type of object. The keys are like the column names in a spreadsheet. The values are like the cells in the spreadsheet. In this way, you can think of a Python dict as a JSON object. For example, here’s how to create a dict in Python:

my_dict = {
    "name": "Mike",
    "age": 52,
    "city": "New York"
}

And here is the equivalent in JavaScript:

var my_json = {
    "name": "Mike",
    "age": 52,
    "city": "New York"
}

Pretty much the same thing, right? Look closely. Key-names and string values get double-quotes. Numbers don’t. These rules apply consistently between JSON and Python dicts. So as you might imagine, it’s easy for JSON data to flow in and out of Python. This is a great gift that has made modern API-work highly accessible to the beginner through a tool that has revolutionized the field of data science and is making inroads into marketing, Jupyter Notebooks.
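
The match is close but not exact. Here is a small sketch of two places where the notations differ, using json.dumps(), which is explained further down. The subscribed and nickname keys are invented for illustration.

```python
import json

# Python spells these values True and None...
profile = {"name": "Mike", "age": 52, "subscribed": True, "nickname": None}

# ...while JSON spells them true and null.
as_json = json.dumps(profile)
print(as_json)
```

Python also accepts single quotes around strings, whereas JSON insists on double quotes, which is one more reason to let the json module do the converting.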

Flattening data

But beware! As data flows between systems, it’s not uncommon for the data to subtly change. For example, the JSON data above might be converted to a string. Strings might look exactly like JSON, but they’re not. They’re just a bunch of characters. Sometimes you’ll hear it called “serializing”, or “flattening”. It’s a subtle point, but worth understanding as it will help with one of the largest stumbling blocks with the Moz Links (and most JSON) APIs.

Objects have APIs

Actual JSON or dict objects have their own little APIs for accessing the data inside of them. The ability to use these JSON and dict APIs goes away when the data is flattened into a string, but it will travel between systems more easily, and when it arrives at the other end, it will be “deserialized” and the API will come back on the other system.

Data flowing between systems

This is the concept of portable, interoperable data. Back when it was called Electronic Data Interchange (or EDI), it was a very big deal. Then along came the web and then XML and then JSON and now it’s just a normal part of doing business.

If you’re in Python and you want to convert a dict to a flattened JSON string, you do the following:

import json

my_dict = {
    "name": "Mike",
    "age": 52,
    "city": "New York"
}

json_string = json.dumps(my_dict)

…which would produce the following output:

'{"name": "Mike", "age": 52, "city": "New York"}'

This looks almost the same as the original dict, but if you look closely you can see that single-quotes are used around the entire thing. Another obvious difference is that you can line-wrap real structured data for readability without any ill effect. You can't do it so easily with strings. That’s why it’s presented all on one line in the above snippet.

Such stringifying processes are done when passing data between different systems because they are not always compatible. Normal text strings on the other hand are compatible with almost everything and can be passed on web-requests with ease. Such flattened strings of JSON data are frequently referred to as the request body, or payload.
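
Here is a rough sketch of the full round trip, showing that the dict “API” disappears while the data is a string and comes back once it is loaded again. The variable names are illustrative only.

```python
import json

my_dict = {"name": "Mike", "age": 52, "city": "New York"}

flat = json.dumps(my_dict)  # dict -> string
print(type(flat).__name__)

# A string has no key lookup, so the dict-style access is gone.
try:
    flat["name"]
    lookup_worked = True
except TypeError:
    lookup_worked = False
print(lookup_worked)

restored = json.loads(flat)  # string -> dict
print(restored["name"])
```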

Anatomy of a request

Again, here’s the example request we made above:

response = requests.post(endpoint, data=json_string, auth=auth_tuple)

Now that you understand what the variable name json_string is telling you about its contents, you shouldn’t be surprised to see this is how we populate that variable:

data_dict = {
    "target": "moz.com/blog",
    "scope": "page",
    "limit": 1
}

json_string = json.dumps(data_dict)

…and the contents of json_string look like this:

'{"target": "moz.com/blog", "scope": "page", "limit": 1}'

This is one of my key discoveries in learning the Moz Links API. It’s something the Moz Links API has in common with countless other APIs out there, but it trips me up every time because it’s so much more convenient to work with structured dicts than flattened strings. However, most APIs expect the data to be a string for portability between systems, so we have to convert it at the last moment before the actual API-call occurs.

Pythonic loads and dumps

Now you may be wondering in that above example, what a dump is doing in the middle of the code. The json.dumps() function is called a “dumper” because it takes a Python object and dumps it into a string. The json.loads() function is called a “loader” because it takes a string and loads it into a Python object.

What appear to be singular and plural options are actually file and string options. If your data lives in a file (or any file-like object), you use json.load() and json.dump(). If your data is a string, you use json.loads() and json.dumps(). The s stands for string. Leaving the s off means the function reads from or writes to a file object.
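
As a sketch of the two pairs side by side: the versions with an s take or return a string, while json.dump() and json.load() write to and read from a file object. An in-memory io.StringIO is used below in place of a real file so that nothing touches the disk.

```python
import io
import json

my_dict = {"name": "Mike", "age": 52}

# With the s: work directly with a string.
text = json.dumps(my_dict)
from_text = json.loads(text)

# Without the s: write to and read from a file-like object.
# io.StringIO stands in for a file opened with open().
fake_file = io.StringIO()
json.dump(my_dict, fake_file)
fake_file.seek(0)  # rewind to the start before reading
from_file = json.load(fake_file)

print(from_text == from_file == my_dict)
```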

Don’t let anybody tell you Python is perfect. It’s just that its rough edges are not excessively objectionable.

Assignment vs. equality

For those of you completely new to Python or programming in general, what we’re doing when we hit the API is called an assignment. The result of requests.post() is being assigned to the variable named response.

response = requests.post(endpoint, data=json_string, auth=auth_tuple)

We are using the = sign to assign the value on the right side of the statement to the variable on the left side. The variable response is now a reference to the object that was returned from the API. Assignment is different from equality. The == sign is used for equality.

# This is assignment:
a = 1  # a is now equal to 1

# This is equality:
a == 1  # True, but relies on the above line having been executed

The POST method

response = requests.post(endpoint, data=json_string, auth=auth_tuple)

The requests library has a function called post() that we call here with 3 arguments. The first argument is the URL of the endpoint. The second argument is the data to send to the endpoint. The third argument is the authentication information to send to the endpoint.

Keyword parameters and their arguments

You may notice that some of the arguments to the post() function have names. Names are set equal to values using the = sign. The first argument is positional both because it comes first and also because there’s no keyword. Keyworded arguments come after position-dependent arguments. Trust me, it all makes sense after a while. We all start to think like Guido van Rossum. Here’s how Python functions get defined:

def arbitrary_function(argument1, name="default value"):
    # do stuff
    print(argument1, name)

The name in the above example is called a “keyword” and the values that come in on those locations are called “arguments”. The value after the = sign in the definition is the default that gets used when the caller doesn’t supply that keyword. Arguments are assigned to variable names right in the function definition, so you can refer to either argument1 or name anywhere inside this function. If you’d like to learn more about the rules of Python functions, you can read about them here.
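Here is a small sketch of what calling such a function looks like. The function describe_lookup is hypothetical and exists only for illustration; it is not part of requests or the Moz API. It just hands back what it received so you can see where each argument landed.

```python
def describe_lookup(target, scope="page", limit=1):
    """Hypothetical helper: return the received arguments as a dict."""
    return {"target": target, "scope": scope, "limit": limit}


# Positional argument only; the keyword parameters fall back to their defaults.
a = describe_lookup("moz.com/blog")

# Keyword arguments can be given in any order after the positional one.
b = describe_lookup("moz.com/blog", limit=5, scope="page")

print(a)
print(b)
```

This is the same shape as requests.post(endpoint, data=json_string, auth=auth_tuple): one positional argument first, then named ones.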

Setting up the request

Okay, so let’s walk through everything necessary for that success-assured moment. We’ve been showing the basic request:

response = requests.post(endpoint, data=json_string, auth=auth_tuple)

…but we haven’t shown everything that goes into it. Let’s do that now. If you’re following along and don’t have the requests library installed, you can do so with the following command from the same terminal environment from which you run Python:

pip install requests

Oftentimes Jupyter will have the requests library installed already, but in case it doesn’t, you can install it with the following command from inside a Notebook cell:

!pip install requests

And now we can put it all together. There are only a few things here that are new. The most important is how we’re taking 2 different variables and combining them into a single variable called AUTH_TUPLE. You will have to get your own ACCESSID and SECRETKEY from the Moz.com website.

The requests library expects these two values to be passed as a Python data structure called a tuple, which it sends to the API as HTTP Basic authentication. A tuple is an ordered sequence of values that can’t be changed after it is created. I find it interesting that requests.post() expects flattened strings for the data parameter, but expects a tuple for the auth parameter. I suppose it makes sense, but these are the subtle things to understand when working with APIs.
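If you are curious what happens to that tuple, here is a rough sketch using only the standard library. When requests is handed a two-item tuple for auth, it treats it as HTTP Basic authentication, which means joining the two values with a colon and base64-encoding the result. The credentials are the placeholder values used in this article, not real ones, and the code below imitates the header by hand instead of calling requests.

```python
import base64

# Placeholder credentials from this article; replace with your own.
ACCESSID = "mozscape-1234567890"
SECRETKEY = "1234567890abcdef1234567890abcdef"
AUTH_TUPLE = (ACCESSID, SECRETKEY)

# Tuples are immutable: item assignment raises TypeError.
try:
    AUTH_TUPLE[0] = "something-else"
except TypeError as error:
    print("Cannot modify tuple:", error)

# HTTP Basic auth joins the two values with a colon and base64-encodes them.
username, password = AUTH_TUPLE
token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
header_value = f"Basic {token}"
print(header_value)
```

Base64 is an encoding, not encryption, so anyone who sees the header can recover the secret key. That is one reason to keep credentials out of shared notebooks.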

Here’s the full code:

import json
from pprint import pprint
import requests

# Set Constants
ACCESSID = "mozscape-1234567890"  # Replace with your access ID
SECRETKEY = "1234567890abcdef1234567890abcdef"  # Replace with your secret key
AUTH_TUPLE = (ACCESSID, SECRETKEY)

# Set Variables
endpoint = "https://lsapi.seomoz.com/v2/anchor_text"
data_dict = {"target": "moz.com/blog", "scope": "page", "limit": 1}
json_string = json.dumps(data_dict)

# Make the Request
response = requests.post(endpoint, data=json_string, auth=AUTH_TUPLE)

# Print the Response
pprint(response.json())

…which outputs:

{'next_token': 'JYkQVg4s9ak8iRBWDiz1qTyguYswnj035nqrQ1oIbW96IGJsb2dZgGzDeAM7Rw==',
 'results': [{'anchor_text': 'moz',
              'external_pages': 7162,
              'external_root_domains': 2026}]}
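Once you have that dict, pulling values out of it is ordinary Python. The sketch below pastes the sample output in as a literal, so it runs without calling the API. The variable name response_dict is my own, and a real response may contain more rows or different keys.

```python
# The sample output shown above, pasted in as a literal so this
# snippet runs without an API call.
response_dict = {
    "next_token": "JYkQVg4s9ak8iRBWDiz1qTyguYswnj035nqrQ1oIbW96IGJsb2dZgGzDeAM7Rw==",
    "results": [
        {
            "anchor_text": "moz",
            "external_pages": 7162,
            "external_root_domains": 2026,
        }
    ],
}

# "results" is a list of dicts, one per row returned.
for row in response_dict["results"]:
    print(
        f"{row['anchor_text']}: {row['external_pages']} pages, "
        f"{row['external_root_domains']} root domains"
    )

# .get() returns None instead of raising if the key is absent.
next_token = response_dict.get("next_token")
print("Has next_token:", next_token is not None)
```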

Using all upper case for the AUTH_TUPLE variable is a convention many use in Python to indicate that the variable is a constant. It’s not a requirement, but it’s a good idea to follow conventions when you can.

You may notice that I didn’t use all uppercase for the endpoint variable. That’s because the anchor_text endpoint is not a constant. There are a number of different endpoints that can take its place depending on what sort of lookup we want to do. The choices are:

  1. anchor_text

  2. final_redirect

  3. global_top_pages

  4. global_top_root_domains

  5. index_metadata

  6. link_intersect

  7. link_status

  8. linking_root_domains

  9. links

  10. top_pages

  11. url_metrics

  12. usage_data
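Swapping endpoints can be sketched as a tiny helper. This assumes every endpoint shares the same base URL as the anchor_text example above, which I have not verified for each one. The names BASE_URL, ENDPOINTS and build_endpoint are my own, for illustration only.

```python
# Assumption: all endpoints live under the same base URL as anchor_text.
BASE_URL = "https://lsapi.seomoz.com/v2/"

ENDPOINTS = [
    "anchor_text",
    "final_redirect",
    "global_top_pages",
    "global_top_root_domains",
    "index_metadata",
    "link_intersect",
    "link_status",
    "linking_root_domains",
    "links",
    "top_pages",
    "url_metrics",
    "usage_data",
]


def build_endpoint(name):
    """Return the full URL for a named endpoint, or raise ValueError."""
    if name not in ENDPOINTS:
        raise ValueError(f"Unknown endpoint: {name!r}")
    return BASE_URL + name


endpoint = build_endpoint("url_metrics")
print(endpoint)
```

Checking the name against the list first means a typo fails in your own code, before any request is sent.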

And that leads into the Jupyter Notebook that I prepared on this topic located here on Github. With this Notebook you can extend the example I gave here to any of the 12 available endpoints to create a variety of useful deliverables, which will be the subject of articles to follow.

Friday, May 26, 2023

The Ultimate Low-Hanging Fruit SEO Strategy — Whiteboard Friday

We all know that we want to maximize our chances for success in SEO, and for that, what we want to do is prioritize tasks that will have a higher impact, and lower effort, but sometimes those get lost in the SEO audit process. In today’s Whiteboard Friday, Aleyda helps develop this low-hanging fruit analysis in parallel of the usual SEO process.

Video Transcription

Welcome to a new edition of Whiteboard Friday. My name is Aleyda Solis. I am an SEO consultant and founder of Orainti, and today I am here to share with you low-hanging fruit SEO. We all know that we want to maximize the opportunities, the chances for success in SEO, and for that, what we tend to do is to prioritize those tasks, those activities that will tend to have a higher output, a higher impact, and lower effort.

Although it's true that this usually depends on the context of our SEO process or project, the restrictions, the opportunities, the resources, the flexibility, etc., the reality is that it tends to be always this let's say strategic, agnostic types of activities that tend to be always there for us to leverage, right?

However, what we tend to do in our SEO processes is this, right? We start the SEO process with an audit, research from keyword competition research to technical SEO, content audit, competition analysis, backlink analysis, etc. This tends to take a little bit of time, like four weeks or so, for example, let's say.

Then we need to analyze all of the data, etc. in order to generate actionable, prioritized SEO recommendations that, at the end of the day, are the ones that we share with our SEO clients or SEO stakeholders in general for execution, right? So all of this process tends to take a little bit of time. Unfortunately, the issue here is that after this time, we tend to face challenges about like, yes, impatience of the stakeholders or the owners of the project, right, and it's natural.

However, as I mentioned before, we can and what I propose here is to develop this low-hanging fruit analysis in parallel of the usual SEO process audit in order to detect this low-hanging fruits that we tend to have, and I will share later on which, in order to start implementation right away, right?

This might seem counterintuitive because you may say, "Oh my God, Aleyda, extra work, besides the one of the audits." But the reality is that ideally here we should set already some frameworks, some reports with data that we tend already to have in the SEO process in order to implement this, right? The benefits of this low-hanging fruit analysis and the implementation that we can start right when we are already doing the usual audit is that it will mitigate impatience from clients or stakeholders.

We will start with those actions that will be much like easier or simpler to coordinate, right? So what I'm talking about here about low-hanging fruit, realistically, I am going to go through three scenarios here of these low-hanging fruit opportunities that very likely will also be applicable for any of your projects, right?

Improving the click-through rate of top ranked pages. If we go and take a look at our current rankings using whatever ranking tools that you use, Google Search Console even, you can take a look at which are those top ranked pages that are already ranking for relevant queries, that are really important and meaningful for you, that have opportunities to improve their click-through rates, that the click-through rates are too low for the rankings of these pages.

You can try to identify if something is off with the snippets, with the titles, with the meta descriptions, for example, or if these pages are not maximizing the visibility because of the lack of structured data implementation and the reason why they are not generating rich snippets or included in a very important, meaningful relevance or feature, for example.

That is the reason of why the click-through rate is too low. You can go and straight forward improve those, right? With the snippets too, I have to say I have found many more scenarios in which Google was rewriting the title, which is now more common than before. Even if Google tries to rewrite it in a way that is still meaningful and relevant, the core key aspect of that particular page a few times has been eliminated.

Or maybe the core page is still there, but when you compare it with your top competitors, with all the pages ranking in that same SERPs, you identify that they are actually showing additional data, additional insights that you are not because of being cut out and, well, that is certainly a missed opportunity for you.

So go and take a look and prioritize the analysis, very straightforward analysis with the data that you already have for those snippets and those search features that you could be leveraging but are not. Then you want to take a look also at those relevant queries that you are ranking with not relevant pages. Maybe in the past, you created pages that better match the intent for those queries.

Not anymore. Or maybe you created at some point many different pages targeting similar queries that made sense in the past. But not anymore either, right? You may find scenarios of content cannibalization issues or lack of content issues, right? For that, what I would highly, highly, highly recommend is to analyze for which of your relevant queries you're ranking with more than one page to identify, to assess if this is detrimental in that scenario.

If less people are clicking or nowhere to click because of that, if you could be consolidating these pages in order to rank better, to pass the value to a single page, and to consolidate all the metrics in a single page instead. For that, I highly, highly recommend to check those relevant queries for which you have more than a single page, right?

Then which is the right page to rank? If it is better to just 301 redirect to a single URL or to differentiate this additional page that you have there because you can identify that it might be also valuable to just tweak it a little bit or optimize it a little bit to refer it and to rank to another query that is equally as relevant for you too.

Second scenario here for low-hanging fruit opportunities is to optimize internal links of almost ranking pages, right?

You probably have these pages that are not yet in that top three or top five positions as these others, but are in the top ten already, top six, top seven, etc., etc., almost ranking for very, very important, meaningful, highly searchable, highly relevant search queries. But when you analyze these pages, you identify very quickly that they are relevant.

The content is okay, but it's the lack of backlinks that is holding you back, right? So how do we do this? Whenever you're analyzing these pages, you want to grab, you want to take a look at all the backlinks per page, like very quick backlink, all the internal links per page. When you crawl your website, you will see how many internal links each of these have from all of the different pages of your website.

You want to pretty much consolidate this data in a single sheet to identify those cases of these pages for which you're in position four, position five, position six that potentially might have a lot of backlinks, but very few internal links or vice versa, you're linking from each of your internal pages but have very, very few backlinks.

So there might be opportunities here too, and for that, you should better link to almost ranking pages for popular queries that you're not internal linking well from the footer, from the top navigation, from secondary navigation, for example. For those popular pages that have a lot of backlinks, for example, but they're not necessarily passing well the value to those meant to be ranked pages, you can leverage this to better cross-link to those, right?

For those that what they are lacking is not internal links but backlinks, you already have great candidates to start your link building campaigns with already. So this can also accelerate a little bit the analysis that you're doing in parallel.

Last but not least, detect search shifts of content decay. There might be content that you created some time ago, some years ago, that it was perfect at that time to target and to rank for certain queries, but potentially Google later on updated or shifted the rank pages for this query because they identified that the intent was different, that they changed.

I have seen many scenarios in which very broad queries that used to list a lot of PLPs, product listing pages are nowadays ranking more guides and far less product listing pages, right? So you want to identify these shifts. Also, potentially some articles that you wrote like a few years ago, that were like the top or the best tools for this or that or the top or the best product for this or that, they need a little bit of an update, right?

You forgot that they needed to be updated every year, for example. So these are the scenarios that I am talking about here. For this, it's critical to go and take a look at, again, your rank tracking data or even your Google Search Console and identify like the number of clicks, the position, and the click-through rate that your top content, your meaningful content through the customer journey has been getting in the last few months to see if it is going down, if it is dropping, right?

If that's the case, you go and take a look at it and see if there's opportunity to refresh or diversify a little bit, depending on the scenario for which queries the content is dropping and update the existing content to keep its relevance based on the other top ranked pages, right? If you see that you're dropping a lot and which are those other pages that are like now outranking you to identify the gap versus yours.

Also, create new content to better fulfill the need in case you identify that no, no, no, no, the page that I was targeting to rank for this query, it doesn't make sense anymore because now Google is ranking much more informational content and this was much more commercially driven or transactional driven, right? So you can again prioritize much faster the development of these other types of content.

So as you can see with this very low-hanging fruit I will say, with data that you tend to already have within the SEO analysis, you can accelerate in parallel this analysis to identify low-hanging fruit opportunity that you can start executing right away, see results faster, mitigate the impatience of your clients, and all the gains much easier with your SEO process.

So hopefully this will serve to you to apply through the different projects that you work in and achieve results faster. Thank you very much.

Video transcription by Speechpad.com