In this article, we will look at links and images and the various scenarios that can be handled using Selenium.

1. Accessing Links using Link Text and Partial Link Text

The easiest way to access links on a web page is with the linkText and partialLinkText locators. linkText matches the entire visible text of the link to find the element, whereas partialLinkText matches the element on a substring of that text.

//HTML Code
<a href="http://www.testersdock.com">Click Here</a>
//Selenium Code
WebElement linkTextEle = driver.findElement(By.linkText("Click Here"));
WebElement partialEle = driver.findElement(By.partialLinkText("Here"));
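
Once located, these links behave like any other WebElement. A minimal follow-up sketch (assuming the linkTextEle element found above):

//Selenium Code
String target = linkTextEle.getAttribute("href"); // read the link's destination URL
System.out.println("Link points to: " + target);
linkTextEle.click(); // follow the link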

Two things to keep in mind:
1(a). If more than one link has the same text, the above locators will identify only the first occurrence.

//HTML Code
<a href="http://www.google.com">Click Here</a>
<a href="http://www.bing.com">Click Here</a>
<a href="http://www.yahoo.com">Click Here</a>
//Selenium Code
WebElement linkTextEle = driver.findElement(By.linkText("Click Here"));
WebElement partiallinkTextEle = driver.findElement(By.partialLinkText("Here"));

In the above example, only the first Click Here (the google.com link) will be identified.
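
If you need one of the later occurrences instead, a minimal workaround sketch is to collect every match with findElements and pick the element by index (here index 1 is assumed to be the bing.com link):

//Selenium Code
List<WebElement> sameTextLinks = driver.findElements(By.linkText("Click Here"));
WebElement secondLink = sameTextLinks.get(1); // second "Click Here", i.e. bing.com
secondLink.click();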

1(b). The parameters for both locators are case-sensitive, meaning Click Here and click here are treated as two different links.

//HTML Code
<a href="http://www.google.com">Click Here</a>
<a href="http://www.bing.com">click here</a>

In the above example, case sensitivity lets us target each link individually: linkText("Click Here") matches the google.com link, while partialLinkText("here") matches only the bing.com link.

//Selenium Code
WebElement linkTextEle = driver.findElement(By.linkText("Click Here"));
WebElement partialEle = driver.findElement(By.partialLinkText("here"));

 

2. Finding Links on a Web Page

To capture all the links on a web page, we will first store all the 'a' tag elements in a list and then print each link's URL using an iterator.

import java.util.Iterator;
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

public class AllLinks {

    public static void main(String[] args) {

        String url;

        // Telling Selenium where to find the ChromeDriver executable
        System.setProperty("webdriver.chrome.driver", "C:\\selenium\\chromedriver.exe");

        // Initialize the browser
        ChromeDriver driver = new ChromeDriver();

        // Launch Yahoo
        driver.get("http://yahoo.com/");

        // Store all elements with the 'a' tag in a list
        List<WebElement> allLinks = driver.findElements(By.tagName("a"));

        // Iterate over the list
        Iterator<WebElement> iterate = allLinks.iterator();

        while (iterate.hasNext()) {
            // Print the URL
            url = iterate.next().getAttribute("href");
            System.out.println(url);
        }

        // Close the browser
        driver.quit();
    }
}

We should get an output like this:

[Screenshot: console output listing all the link URLs found on the page]

3. Finding Broken Links

To check whether a URL is working or broken, we need its HTTP status code. HTTP status codes are response codes returned by servers, and they help identify the cause of an error.

HTTP Status Codes and their Significance:

1xx (Informational): The response indicates that the request was received and understood.
2xx (Success): The response indicates the action requested by the client was received, understood, accepted, and processed successfully.
3xx (Redirection): The response indicates the client must take additional action to complete the request.
4xx (Client Error): The response is intended for situations in which the error seems to have been caused by the client.
5xx (Server Error): The response is received when a server fails to fulfill an apparently valid request.

As we can see in the above table, any link that returns a 2xx response code is a valid link. Using the getResponseCode() method of HttpURLConnection, we will first obtain the HTTP response code for each link. getResponseCode() returns whatever status code the server actually sends back, for example 200 for a working page or 404 for a missing one. In this script, a link that returns a response code of 200 is considered valid, and anything else is treated as invalid or broken.
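
As a standalone sketch of that check (the isBroken helper name is just for illustration; a HEAD request is assumed to be sufficient, since we only need the status code and not the page body):

//Java sketch
import java.net.HttpURLConnection;
import java.net.URL;

public class LinkCheck {

    // Returns true when the URL does not answer with a 2xx status code
    static boolean isBroken(String link) {
        try {
            HttpURLConnection connection = (HttpURLConnection) new URL(link).openConnection();
            connection.setRequestMethod("HEAD"); // ask for headers only, no body download
            connection.connect();
            int code = connection.getResponseCode();
            return code < 200 || code >= 300;
        } catch (Exception e) {
            return true; // unreachable or malformed URLs are treated as broken
        }
    }
}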

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

public class BrokenLinks {

    public static void main(String[] args) {

        // Telling Selenium where to find the ChromeDriver executable
        System.setProperty("webdriver.chrome.driver", "C:\\selenium\\chromedriver.exe");

        // Initialize the browser
        ChromeDriver driver = new ChromeDriver();

        // Maximize the browser window
        driver.manage().window().maximize();

        // Launch Google
        driver.get("https://www.google.co.in/");

        // Store all the 'a' tag elements in a list
        List<WebElement> links = driver.findElements(By.tagName("a"));

        // Display the total number of links on the web page
        System.out.println("Total links are " + links.size());

        for (int i = 0; i < links.size(); i++) {
            try {
                // Iterate through the list and get each URL
                String nextHref = links.get(i).getAttribute("href");

                // Get the response code for the URL
                URL url = new URL(nextHref);
                HttpURLConnection connection = (HttpURLConnection) url.openConnection();
                connection.setRequestMethod("GET");
                connection.connect();
                int code = connection.getResponseCode();

                // Check whether the URL is valid or invalid
                if (code == 200) System.out.println("Valid Link:" + nextHref);
                else System.out.println("INVALID Link:" + nextHref);
            } catch (Exception e) {
                System.out.println(e.getMessage());
            }
        }

        // Close the browser
        driver.quit();
    }
}
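
One caveat: on real pages, some 'a' tags have no href at all, or point to javascript: or mailto: targets that HttpURLConnection cannot check. A small refinement (assuming you only care about http/https links) is to skip those entries inside the loop before building the URL:

//Java sketch, inside the for loop
String nextHref = links.get(i).getAttribute("href");
if (nextHref == null || !nextHref.startsWith("http")) {
    continue; // skip anchors without an http(s) URL
}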

On executing it, we should get a list of URLs along with their status as Valid or Invalid.

[Screenshot: console output showing each URL marked as Valid or INVALID]

4. Finding Broken Images

An image tag looks like this: <img src="link">. So first, we will find all the images using the img tag and then fetch each image URL from the src attribute. After that, just as with links, an image whose URL returns a 200 response code is treated as valid, and any other response is treated as broken.

import java.net.HttpURLConnection;
import java.net.URL;
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

public class BrokenImages {

    public static void main(String[] args) {

        // Telling Selenium where to find the ChromeDriver executable
        System.setProperty("webdriver.chrome.driver", "C:\\selenium\\chromedriver.exe");

        // Initialize the browser
        ChromeDriver driver = new ChromeDriver();

        // Maximize the browser window
        driver.manage().window().maximize();

        // Launch Pixabay
        driver.get("https://www.pixabay.com/");

        // Store all the 'img' tag elements in a list
        List<WebElement> images = driver.findElements(By.tagName("img"));

        // Display the total number of images on the web page
        System.out.println("Total images are " + images.size());

        for (int i = 0; i < images.size(); i++) {
            try {
                // Iterate through the list and get each image URL from the src attribute
                String nextSrc = images.get(i).getAttribute("src");

                // Get the response code for the URL
                URL url = new URL(nextSrc);
                HttpURLConnection connection = (HttpURLConnection) url.openConnection();
                connection.setRequestMethod("GET");
                connection.connect();
                int code = connection.getResponseCode();

                // Check whether the image URL is valid or invalid
                if (code == 200) {
                    System.out.println("Valid Image:" + nextSrc);
                } else {
                    System.out.println("INVALID Image:" + nextSrc);
                }
            } catch (Exception e) {
                System.out.println(e.getMessage());
            }
        }

        // Close the browser
        driver.quit();
    }
}

After execution, we should get a list of valid and invalid image URLs.

[Screenshot: console output showing each image URL marked as Valid or INVALID]
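
As an additional, browser-side check, you can also ask the browser whether it actually rendered each image. Below is a sketch of this alternative approach using JavascriptExecutor and the DOM's naturalWidth property (0 means the image failed to load); the class name BrokenImagesDom is just for illustration:

//Java sketch
import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.JavascriptExecutor;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;

public class BrokenImagesDom {

    public static void main(String[] args) {

        // Same driver setup as in the examples above
        System.setProperty("webdriver.chrome.driver", "C:\\selenium\\chromedriver.exe");
        ChromeDriver driver = new ChromeDriver();
        driver.get("https://www.pixabay.com/");

        List<WebElement> images = driver.findElements(By.tagName("img"));

        for (WebElement img : images) {
            // naturalWidth is 0 when the browser could not load the image
            Long naturalWidth = (Long) ((JavascriptExecutor) driver)
                    .executeScript("return arguments[0].naturalWidth;", img);

            if (naturalWidth != null && naturalWidth > 0) {
                System.out.println("Loaded image: " + img.getAttribute("src"));
            } else {
                System.out.println("BROKEN image: " + img.getAttribute("src"));
            }
        }

        driver.quit();
    }
}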