On a recent project, I was tasked with detecting the WordPress version for a batch of supplied URLs. After some research, a closer look at the markup WP produces, and some testing, I settled on the three simplest and most reliable ways to get the version from scraped WP source code.
Meta “generator” tag
On most WordPress sites, unless it has been intentionally disabled, you can spot the generator meta tag in the page's head, for example:
<meta name="generator" content="WordPress 5.6" />
A straightforward regex like the one below, applied to the tag's content attribute, will get you the WP version easily:
^WordPress\s+(\*|\d+(?:\.\d+){0,2}(?:\.\*)?)$
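As a minimal sketch of this first method, the snippet below pulls the generator meta tag out of an HTML string and applies the regex above to its content attribute. It uses only Python's standard library so it runs without extra dependencies; the names GeneratorMetaParser and version_from_meta are hypothetical, and the sample HTML is made up for illustration.

```python
import re
from html.parser import HTMLParser

# The regex from the post, applied to the content attribute of the tag.
VERSION_RE = re.compile(r"^WordPress\s+(\*|\d+(?:\.\d+){0,2}(?:\.\*)?)$")


class GeneratorMetaParser(HTMLParser):
    """Collects the content of every <meta name="generator"> tag."""

    def __init__(self):
        super().__init__()
        self.generators = []

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        if (attr_map.get("name") or "").lower() == "generator":
            self.generators.append(attr_map.get("content") or "")


def version_from_meta(html):
    """Return the WordPress version from the generator meta tag, or None."""
    parser = GeneratorMetaParser()
    parser.feed(html)
    # Plugins may add their own generator tags, so check every one.
    for content in parser.generators:
        match = VERSION_RE.match(content.strip())
        if match:
            return match.group(1)
    return None


# Hypothetical sample page, not taken from a real site.
sample = '<html><head><meta name="generator" content="WordPress 5.6" /></head></html>'
print(version_from_meta(sample))
```

Returning None when nothing matches makes it easy to fall through to another detection method for sites that have removed the tag.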
Feed URL <generator/> XML node
You can also examine the source of the site's /feed URL. Since that is an XML document, it is easy to traverse to the <generator/> node. In my case, I was using Python and the BeautifulSoup library, so getting to the node was as simple as:
channel = soup.find('channel')
generator = channel.find('generator')
Then take the generator node's text content and extract the version number with a simple regex, like:
^https://wordpress.org/\?v=(\*|\d+(?:\.\d+){0,2}(?:\.\*)?)$
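The feed approach can be sketched end to end as follows. This version swaps BeautifulSoup for the standard library's xml.etree.ElementTree so that it is self-contained; the function name version_from_feed and the sample feed are hypothetical, and the sketch assumes the feed is a regular RSS document with a channel element directly under the root.

```python
import re
import xml.etree.ElementTree as ET

# The regex from the post, applied to the text of the <generator> node.
FEED_VERSION_RE = re.compile(
    r"^https://wordpress.org/\?v=(\*|\d+(?:\.\d+){0,2}(?:\.\*)?)$"
)


def version_from_feed(feed_xml):
    """Return the WordPress version from an RSS feed's <generator>, or None."""
    root = ET.fromstring(feed_xml)
    # Assumes the usual RSS layout: <rss><channel><generator>...</generator>.
    generator = root.find("./channel/generator")
    if generator is None or generator.text is None:
        return None
    match = FEED_VERSION_RE.match(generator.text.strip())
    return match.group(1) if match else None


# Hypothetical sample feed, trimmed down for illustration.
sample_feed = """<rss version="2.0">
  <channel>
    <title>Example</title>
    <generator>https://wordpress.org/?v=5.6</generator>
  </channel>
</rss>"""
print(version_from_feed(sample_feed))
```

Note that ET.fromstring raises an exception on malformed XML, so a real scraper would likely want to catch that and treat it as "version not found".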
wp-login.php source code
If you point your scraper at the /wp-login.php page, you will get a bunch of asset links (CSS, JS) that have ?ver query strings appended, for example /wp-includes/css/dashicons.min.css?ver=5.6 or /wp-includes/css/buttons.min.css?ver=5.6.
Then, with a simple regex like the one below, you will be able to get the version from the node's href attribute (or the src attribute for scripts):
.+\?ver=(\*|\d+(?:\.\d+){0,2}(?:\.\*)?)$
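Here is a rough sketch of the login-page method, again using only the standard library. It collects href values from link tags and src values from script tags, applies the regex above, and returns the most frequently seen version. Picking the most common value is an assumption of this sketch rather than something from my original scraper: some bundled assets can carry their own library version instead of the core one. The names AssetUrlParser and version_from_login_page are hypothetical, as is the sample markup.

```python
import re
from collections import Counter
from html.parser import HTMLParser

# The regex from the post, applied to each asset URL.
ASSET_VERSION_RE = re.compile(r".+\?ver=(\*|\d+(?:\.\d+){0,2}(?:\.\*)?)$")


class AssetUrlParser(HTMLParser):
    """Collects href values from <link> tags and src values from <script> tags."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        if tag == "link" and attr_map.get("href"):
            self.urls.append(attr_map["href"])
        elif tag == "script" and attr_map.get("src"):
            self.urls.append(attr_map["src"])


def version_from_login_page(html):
    """Return the most common ?ver= value among asset URLs, or None."""
    parser = AssetUrlParser()
    parser.feed(html)
    versions = Counter()
    for url in parser.urls:
        match = ASSET_VERSION_RE.match(url)
        if match:
            versions[match.group(1)] += 1
    if not versions:
        return None
    # Assumption: the core version is the one that appears most often.
    return versions.most_common(1)[0][0]


# Hypothetical sample markup, modelled on the asset links mentioned above.
sample_login = """
<link rel="stylesheet" href="/wp-includes/css/dashicons.min.css?ver=5.6" />
<link rel="stylesheet" href="/wp-includes/css/buttons.min.css?ver=5.6" />
<script src="/wp-includes/js/jquery/jquery.min.js?ver=3.5.1"></script>
"""
print(version_from_login_page(sample_login))
```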
In my experience, these three methods alone were enough to pick up the WP version on close to 95% of the supplied URLs.
If you are aware of some other method, feel free to write it down in the comments!