In this article, we will look at links and images and the various scenarios that can be handled using Selenium.

1. Accessing Links using Link Text and Partial Link Text

The easiest way to access links on a web page is with the linkText and partialLinkText locators. linkText matches the entire visible text of the link to find the element, whereas partialLinkText matches the element on a substring of that text.

//HTML Code
<a href="http://www.testersdock.com">Click Here</a>
//Selenium Code
WebElement linkTextEle = driver.findElement(By.linkText("Click Here"));
WebElement partialEle = driver.findElement(By.partialLinkText("Here"));
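
Once located, these links behave like any other WebElement. A minimal follow-up sketch (assuming the linkTextEle element found above):

//Selenium Code
String target = linkTextEle.getAttribute("href"); // read the link's destination URL
System.out.println("Link points to: " + target);
linkTextEle.click(); // follow the link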

Two things to keep in mind:
1(a). If more than one link has the same text, the above locators will identify only the first occurrence.

//HTML Code
<a href="http://www.google.com">Click Here</a>
<a href="http://www.bing.com">Click Here</a>
<a href="http://www.yahoo.com">Click Here</a>
//Selenium Code
WebElement linkTextEle = driver.findElement(By.linkText("Click Here"));
WebElement partiallinkTextEle = driver.findElement(By.partialLinkText("Here"));

In the above example, only the first Click Here (the google.com link) will be identified.
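
If you need one of the later occurrences instead, a minimal workaround sketch is to collect every match with findElements and pick the element by index (here index 1 is assumed to be the bing.com link):

//Selenium Code
List<WebElement> sameTextLinks = driver.findElements(By.linkText("Click Here"));
WebElement secondLink = sameTextLinks.get(1); // second "Click Here", i.e. bing.com
secondLink.click();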

1(b). The parameters for both locators are case-sensitive, meaning Click Here and click here are treated as two different links.

//HTML Code
<a href="http://www.google.com">Click Here</a>
<a href="http://www.bing.com">click here</a>

In the above example, case sensitivity lets us target each link individually: linkText("Click Here") matches the google.com link, while partialLinkText("here") matches only the bing.com link.

//Selenium Code
WebElement linkTextEle = driver.findElement(By.linkText("Click Here"));
WebElement partialEle = driver.findElement(By.partialLinkText("here"));

 

2. Finding Links on a Web Page

To capture all the links on a web page, we will first store all the 'a' tag elements in a list and then print each link's URL using an iterator.

import java.util.Iterator;
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

public class AllLinks {

    public static void main(String[] args) {

        String url;

        // Telling Selenium where to find the ChromeDriver executable
        System.setProperty("webdriver.chrome.driver", "C:\\selenium\\chromedriver.exe");

        // Initialize the browser
        ChromeDriver driver = new ChromeDriver();

        // Launch Yahoo
        driver.get("http://yahoo.com/");

        // Store all elements with the 'a' tag in a list
        List<WebElement> allLinks = driver.findElements(By.tagName("a"));

        // Iterate over the list
        Iterator<WebElement> iterate = allLinks.iterator();

        while (iterate.hasNext()) {
            // Print the URL
            url = iterate.next().getAttribute("href");
            System.out.println(url);
        }

        // Close the browser
        driver.quit();
    }
}

We should get an output like this:

[Screenshot: console output listing all the link URLs found on the page]

3. Finding Broken Links

To check whether a URL is working or broken, we need its HTTP status code. HTTP status codes are response codes returned by servers, and they help identify the cause of an error.

HTTP Status Codes and their Significance:

1xx (Informational): The response indicates that the request was received and understood.
2xx (Success): The response indicates the action requested by the client was received, understood, accepted, and processed successfully.
3xx (Redirection): The response indicates the client must take additional action to complete the request.
4xx (Client Error): The response is intended for situations in which the error seems to have been caused by the client.
5xx (Server Error): The response is received when a server fails to fulfill an apparently valid request.

As we can see in the above table, any link that returns a 2xx response code is a valid link. Using the getResponseCode() method of HttpURLConnection, we will first obtain the HTTP response code for each link. getResponseCode() returns whatever status code the server actually sends back, for example 200 for a working page or 404 for a missing one. In this script, a link that returns a response code of 200 is considered valid, and anything else is treated as invalid or broken.
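
As a standalone sketch of that check (the isBroken helper name is just for illustration; a HEAD request is assumed to be sufficient, since we only need the status code and not the page body):

//Java sketch
import java.net.HttpURLConnection;
import java.net.URL;

public class LinkCheck {

    // Returns true when the URL does not answer with a 2xx status code
    static boolean isBroken(String link) {
        try {
            HttpURLConnection connection = (HttpURLConnection) new URL(link).openConnection();
            connection.setRequestMethod("HEAD"); // ask for headers only, no body download
            connection.connect();
            int code = connection.getResponseCode();
            return code < 200 || code >= 300;
        } catch (Exception e) {
            return true; // unreachable or malformed URLs are treated as broken
        }
    }
}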

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

public class BrokenLinks {

    public static void main(String[] args) {

        // Telling Selenium where to find the ChromeDriver executable
        System.setProperty("webdriver.chrome.driver", "C:\\selenium\\chromedriver.exe");

        // Initialize the browser
        ChromeDriver driver = new ChromeDriver();

        // Maximize the browser window
        driver.manage().window().maximize();

        // Launch Google
        driver.get("https://www.google.co.in/");

        // Store all the 'a' tag elements in a list
        List<WebElement> links = driver.findElements(By.tagName("a"));

        // Display the total number of links on the web page
        System.out.println("Total links are " + links.size());

        for (int i = 0; i < links.size(); i++) {
            try {
                // Iterate through the list and get each URL
                String nextHref = links.get(i).getAttribute("href");

                // Get the response code for the URL
                URL url = new URL(nextHref);
                HttpURLConnection connection = (HttpURLConnection) url.openConnection();
                connection.setRequestMethod("GET");
                connection.connect();
                int code = connection.getResponseCode();

                // Check whether the URL is valid or invalid
                if (code == 200) System.out.println("Valid Link:" + nextHref);
                else System.out.println("INVALID Link:" + nextHref);
            } catch (Exception e) {
                System.out.println(e.getMessage());
            }
        }

        // Close the browser
        driver.quit();
    }
}
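
One caveat: on real pages, some 'a' tags have no href at all, or point to javascript: or mailto: targets that HttpURLConnection cannot check. A small refinement (assuming you only care about http/https links) is to skip those entries inside the loop before building the URL:

//Java sketch, inside the for loop
String nextHref = links.get(i).getAttribute("href");
if (nextHref == null || !nextHref.startsWith("http")) {
    continue; // skip anchors without an http(s) URL
}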

On executing it, we should get a list of URLs along with their status as Valid or Invalid.

[Screenshot: console output showing each URL marked as Valid or INVALID]

4. Finding Broken Images

An image tag looks like this: <img src="link">. So first, we will find all the images using the img tag and then fetch each image URL from the src attribute. After that, just as with links, an image whose URL returns a 200 response code is treated as valid, and any other response is treated as broken.

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

public class BrokenImages {

    public static void main(String[] args) {

        // Telling Selenium where to find the ChromeDriver executable
        System.setProperty("webdriver.chrome.driver", "C:\\selenium\\chromedriver.exe");

        // Initialize the browser
        ChromeDriver driver = new ChromeDriver();

        // Maximize the browser window
        driver.manage().window().maximize();

        // Launch Pixabay
        driver.get("https://www.pixabay.com/");

        // Store all the 'img' tag elements in a list
        List<WebElement> images = driver.findElements(By.tagName("img"));

        // Display the total number of images on the web page
        System.out.println("Total images are " + images.size());

        for (int i = 0; i < images.size(); i++) {
            try {
                // Iterate through the list and get each image URL from the src attribute
                String nextSrc = images.get(i).getAttribute("src");

                // Get the response code for the URL
                URL url = new URL(nextSrc);
                HttpURLConnection connection = (HttpURLConnection) url.openConnection();
                connection.setRequestMethod("GET");
                connection.connect();
                int code = connection.getResponseCode();

                // Check whether the image URL is valid or invalid
                if (code == 200) {
                    System.out.println("Valid Image:" + nextSrc);
                } else {
                    System.out.println("INVALID Image:" + nextSrc);
                }
            } catch (Exception e) {
                System.out.println(e.getMessage());
            }
        }

        // Close the browser
        driver.quit();
    }
}

After execution, we should get a list of valid and invalid image URLs.

[Screenshot: console output showing each image URL marked as Valid or INVALID]
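
As an additional, browser-side check, you can also ask the browser whether it actually rendered each image. Below is a sketch of this alternative approach using JavascriptExecutor and the DOM's naturalWidth property (0 means the image failed to load); the class name BrokenImagesDom is just for illustration:

//Java sketch
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

public class BrokenImagesDom {

    public static void main(String[] args) {

        // Same driver setup as in the examples above
        System.setProperty("webdriver.chrome.driver", "C:\\selenium\\chromedriver.exe");
        ChromeDriver driver = new ChromeDriver();
        driver.get("https://www.pixabay.com/");

        List<WebElement> images = driver.findElements(By.tagName("img"));

        for (WebElement img : images) {
            // naturalWidth is 0 when the browser could not load the image
            Long naturalWidth = (Long) ((JavascriptExecutor) driver)
                    .executeScript("return arguments[0].naturalWidth;", img);

            if (naturalWidth != null && naturalWidth > 0) {
                System.out.println("Loaded image: " + img.getAttribute("src"));
            } else {
                System.out.println("BROKEN image: " + img.getAttribute("src"));
            }
        }

        driver.quit();
    }
}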