Write a Python program to extract year, month, and date from a URL
URLs can contain date information in a variety of formats. For example, a URL for a news article published on August 4, 2023 might be:
https://www.sample.com/news/2023-08-04/my-article.html
Another example might be a URL for a product page on an e-commerce website, where the product was added to the catalog on January 1, 2023:
https://www.sample.com/products/my-product/?added_on=2023-01-01
In this blog post, we will show you how to write a Python program to extract the year, month, and date from a URL that contains a date in YYYY-MM-DD format, whether the date appears in the path or in the query string.
Step 1: Import the necessary modules
import re
The re module provides regular expression support in Python.
Step 2: Define a regular expression to match dates
date_regex = r"(\d{4}-\d{2}-\d{2})"
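This regular expression matches dates in the format YYYY-MM-DD: four digits, a hyphen, two digits, a hyphen, and two digits. The parentheses capture the whole date as group 1. The pattern checks only the shape of the text, so a string such as 2023-99-99 also matches.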
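Since the goal is the year, month, and day as separate values, one option is to give each part its own capturing group. The sketch below is an illustrative variant, not the pattern used in the rest of this post; the names date_parts_regex and split_date_parts, and the group names, are assumptions made for the example.

```python
import re

# Hypothetical variant of date_regex with one named group per date part.
date_parts_regex = r"(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})"

def split_date_parts(url):
    """Return (year, month, day) as integers, or None if no date is found."""
    match = re.search(date_parts_regex, url)
    if not match:
        return None
    # Groups come back as strings, so convert each one to an integer.
    return (
        int(match.group("year")),
        int(match.group("month")),
        int(match.group("day")),
    )

print(split_date_parts("https://www.sample.com/news/2023-08-04/my-article.html"))
# (2023, 8, 4)
```

Named groups make the code easier to read than numbered groups when a pattern has several parts.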
Step 3: Extract the date from the URL
url = "https://www.sample.com/news/2023-08-04/my-article.html"
match = re.search(date_regex, url)
if match:
    date = match.group(1)
else:
    date = None
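The re.search() function scans the string and returns a match object for the first place where the regular expression matches, or None if there is no match. The match.group(1) call returns the text captured by the first pair of parentheses, which here is the whole date.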
Step 4: Print the date
print(date)
Output:
2023-08-04
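The value printed above is a single string. To get the year, month, and day as numbers, and to reject text that has the right shape but is not a real calendar date, the string can be passed to datetime.strptime from the standard library. This is a minimal sketch under that assumption; the function name parse_url_date is hypothetical.

```python
import re
from datetime import datetime

date_regex = r"(\d{4}-\d{2}-\d{2})"

def parse_url_date(url):
    """Return a datetime.date for the first YYYY-MM-DD in url, or None."""
    match = re.search(date_regex, url)
    if not match:
        return None
    try:
        # strptime raises ValueError for impossible dates such as 2023-13-45.
        return datetime.strptime(match.group(1), "%Y-%m-%d").date()
    except ValueError:
        return None

parsed = parse_url_date("https://www.sample.com/news/2023-08-04/my-article.html")
if parsed:
    print(parsed.year, parsed.month, parsed.day)  # 2023 8 4
```

The returned date object exposes year, month, and day as integer attributes, so no manual string splitting is needed.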
Example 2
url = "https://www.sample.com/products/my-product/?added_on=2023-01-01"
match = re.search(date_regex, url)
if match:
    date = match.group(1)
else:
    date = None
print(date)
Output:
2023-01-01
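Both examples use hyphen-separated dates. Some sites put the date in the path as /2023/08/04/ instead, which date_regex does not match. The sketch below is one possible extension, assuming only those two layouts need to be supported; the names flexible_date_regex and extract_date_parts are hypothetical.

```python
import re

# Hypothetical pattern: YYYY, a separator ("-" or "/"), MM, the same
# separator again (backreference \2), then DD. The lookarounds stop the
# pattern from matching inside a longer run of digits.
flexible_date_regex = r"(?<!\d)(\d{4})([-/])(\d{2})\2(\d{2})(?!\d)"

def extract_date_parts(url):
    """Return (year, month, day) as integers, or None if no date is found."""
    match = re.search(flexible_date_regex, url)
    if not match:
        return None
    # Group 2 is the separator, so the date parts are groups 1, 3, and 4.
    return int(match.group(1)), int(match.group(3)), int(match.group(4))

print(extract_date_parts("https://www.sample.com/2023/08/04/my-article.html"))
# (2023, 8, 4)
print(extract_date_parts("https://www.sample.com/products/my-product/?added_on=2023-01-01"))
# (2023, 1, 1)
```

Other layouts, such as DD-MM-YYYY or month names, would need their own patterns.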
Conclusion
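This Python program can be used to extract the date from any URL that contains a date in YYYY-MM-DD format, whether it appears in the path or in the query string. Dates written in other formats need their own regular expressions. This can be useful for a variety of tasks, such as data analysis, web scraping, and API interaction.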