Friday, August 15, 2014

How to read data and download a file from the Internet in Java

Java provides classes for network programming in the package java.net. This article shows you how to read data and download a file from the Internet in Java using the class URL.
Computers connected to a network can communicate with each other only if they agree on the rules of communication, called protocols, that define how to request and exchange data. The World Wide Web (WWW) uses uniform resource locators (URLs) to identify online resources. For example, a URL can state that there is a file called training.html located at a particular remote host, that the program should use the HTTP protocol to request this file, and that the request has to be sent via port 80.
The hostname must be unique, and it is automatically converted to the IP address of the server by the Domain Name System (DNS), usually through a resolver operated by your Internet service provider (ISP).

Finding a resource online is somewhat similar to finding a person by his or her address. The role of an IP address is similar to the role of a street number of a building, and a port plays the role of an apartment number in that building. Many people can live in the same building, just as many programs can run on the same server. A port is simply a unique number assigned to a server program running on the machine.

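A URL has two main components:

  • The protocol identifier, such as http.
  • The resource name, which is the complete address of the resource, such as a host name followed by a path.

Note that the protocol identifier and the resource name are separated by a colon and two forward slashes.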

How to create a URL
The easiest way to create a URL object is from a String that represents the human-readable form of the URL address. This is typically the form that another person will use for a URL. In your Java program, you can use a String containing this text to create a URL object:
URL myURL = new URL("");
The URL object created above represents an absolute URL. An absolute URL contains all of the information necessary to reach the resource in question. You can also create URL objects from a relative URL address.

How to create a URL relative to another
In your Java programs, you can create a URL object from a relative URL specification. For example, suppose you know the URLs of two pages, page1.html and page2.html, at the same site.

You can create URL objects for these pages relative to their common base URL like this:

URL myURL = new URL("");
URL page1URL = new URL(myURL, "page1.html");
URL page2URL = new URL(myURL, "page2.html");
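
As a minimal sketch of how relative resolution behaves, the example below resolves two page names against a base URL. The host example.com and the /pages/ path are placeholders chosen for illustration, since the article's own site address is not shown. Note that JDK 20 and later deprecate the URL constructors in favor of URI.create(...).toURL(); they still work, but the compiler prints a warning.

```java
import java.net.MalformedURLException;
import java.net.URL;

class RelativeUrlExample {
    // Resolves a relative specification against a base URL.
    static URL resolve(URL base, String relative) throws MalformedURLException {
        return new URL(base, relative);
    }

    public static void main(String[] args) throws MalformedURLException {
        // Placeholder base URL (an assumption, not the site from the article).
        URL base = new URL("http://example.com/pages/");
        System.out.println(resolve(base, "page1.html")); // http://example.com/pages/page1.html
        System.out.println(resolve(base, "page2.html")); // http://example.com/pages/page2.html
    }
}
```

The trailing slash on the base matters: resolving page1.html against http://example.com/pages (no slash) replaces the last path segment and yields http://example.com/page1.html.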

The URL class also provides constructors that take the components of a URL (protocol, host name, file name, and optionally a port number) as separate arguments. For instance,
new URL("http", "", "/pages/page1.html");
This is equivalent to
new URL("");
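
To make the equivalence concrete, here is a small sketch that builds the same address both ways and compares the string forms. The host example.com is a placeholder, and port 8080 in the last call is only there to show the four-argument constructor. The comparison uses toExternalForm() rather than URL.equals(), because equals() may try to resolve host names over the network.

```java
import java.net.MalformedURLException;
import java.net.URL;

class UrlPartsExample {
    // Builds a URL from separate protocol, host, and file components.
    static URL fromParts(String protocol, String host, String file)
            throws MalformedURLException {
        return new URL(protocol, host, file);
    }

    public static void main(String[] args) throws MalformedURLException {
        // example.com is a placeholder host, not the one used in the article.
        URL fromParts = fromParts("http", "example.com", "/pages/page1.html");
        URL fromString = new URL("http://example.com/pages/page1.html");

        // Compare the textual forms; both constructors describe the same address.
        System.out.println(
                fromParts.toExternalForm().equals(fromString.toExternalForm())); // true

        // The four-argument constructor adds an explicit port number.
        URL withPort = new URL("http", "example.com", 8080, "/pages/page1.html");
        System.out.println(withPort); // http://example.com:8080/pages/page1.html
    }
}
```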

Each of these URL constructors throws a MalformedURLException if the arguments to the constructor refer to a null or unknown protocol. Typically, you want to catch and handle this exception by embedding your URL constructor statements in a try/catch pair.
Note: URLs are "write-once" objects. Once you've created a URL object, you cannot change any of its attributes (protocol, host name, filename, or port number).
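
One possible shape for that try/catch pair is sketched below. The helper name parse and the choice to return an Optional are illustrative assumptions, and the sample addresses are placeholders.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.util.Optional;

class SafeUrlFactory {
    // Returns the parsed URL, or an empty Optional when the text is not a valid URL.
    static Optional<URL> parse(String address) {
        try {
            return Optional.of(new URL(address));
        } catch (MalformedURLException e) {
            // Reached for a missing or unknown protocol, among other problems.
            System.err.println("Bad URL '" + address + "': " + e.getMessage());
            return Optional.empty();
        }
    }

    public static void main(String[] args) {
        String[] candidates = {
            "http://example.com/index.html", // well-formed
            "foo://example.com/",            // unknown protocol
            "example.com/index.html"         // no protocol at all
        };
        for (String candidate : candidates) {
            System.out.println(candidate + " -> valid: " + parse(candidate).isPresent());
        }
    }
}
```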

Parsing a URL
The URL class provides several methods that let you query URL objects. You can get the protocol, authority, host name, port number, path, query, filename, and reference from a URL using these accessor methods. For more detail on these methods, see the documentation of the URL class. Parsing a URL produces output such as:
Protocol : http
Port : 80
Host :
Path : /2014/07/how-java-program.html
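
A sketch of a program that prints such fields is shown here. The path comes from the output above; the host example.com, the query string, and the reference are made up for illustration.

```java
import java.net.MalformedURLException;
import java.net.URL;

class ParseUrlExample {
    // Formats the individual components of a URL, one per line.
    static String describe(URL url) {
        return String.join("\n",
                "Protocol : " + url.getProtocol(),
                "Authority : " + url.getAuthority(),
                "Host : " + url.getHost(),
                "Port : " + url.getPort(),
                "Path : " + url.getPath(),
                "Query : " + url.getQuery(),
                "File : " + url.getFile(),
                "Ref : " + url.getRef());
    }

    public static void main(String[] args) throws MalformedURLException {
        // Placeholder host, query, and reference (assumptions for this sketch).
        URL url = new URL(
                "http://example.com:80/2014/07/how-java-program.html?lang=java#top");
        System.out.println(describe(url));
    }
}
```

Note that getPort() returns -1 when the URL string contains no explicit port; getDefaultPort() returns the protocol's default (80 for http) in that case.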

Connecting to a URL
After you've successfully created a URL object, you can call the URL object's openConnection method to get a URLConnection object, or one of its protocol-specific subclasses, e.g. java.net.HttpURLConnection. You can use this URLConnection object to set up parameters and general request properties that you may need before connecting. The connection to the remote object represented by the URL is only initiated when the URLConnection.connect method is called. When you do this, you are initializing a communication link between your Java program and the URL over the network.

  • A new URLConnection object is created every time by calling the openConnection method of the protocol handler for this URL.
  • You are not always required to explicitly call the connect method to initiate the connection. Operations that depend on being connected, like getInputStream, getOutputStream, etc, will implicitly perform the connection, if necessary.
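
The sequence described above can be sketched as follows. The timeout values and the Accept header are arbitrary choices for illustration, and the program expects the URL to open as a command-line argument rather than assuming any particular site.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLConnection;

class OpenConnectionExample {
    // Creates the connection object and sets request parameters.
    // No network traffic happens here; the link is opened later by connect().
    static URLConnection prepare(URL url) throws IOException {
        URLConnection connection = url.openConnection();
        connection.setConnectTimeout(5_000);  // milliseconds, illustrative value
        connection.setReadTimeout(10_000);    // milliseconds, illustrative value
        connection.setRequestProperty("Accept", "text/html");
        return connection;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java OpenConnectionExample <url>");
            return;
        }
        URLConnection connection = prepare(new URL(args[0]));
        connection.connect(); // the communication link is established here
        System.out.println("Content type: " + connection.getContentType());
    }
}
```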

Reading Data from the Internet
There are two ways to read data from the Internet:
  1. Reading directly from a URL
  2. Reading from a URLConnection
Reading Directly from a URL
After you've successfully created a URL, you can call the URL's openStream() method to get a stream from which you can read the contents of the URL. The openStream() method returns a java.io.InputStream object, so reading from a URL is as easy as reading from an input stream. A small Java program can use openStream() to get an input stream on a URL, open a BufferedReader on that input stream, and read from the BufferedReader, thereby reading from the URL.
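
A minimal sketch of such a program follows. The class name URLReader, the readText helper, and the UTF-8 charset are assumptions made for this example, and the URL is taken from the command line because the article's own address is not shown.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class URLReader {
    // Reads the whole resource behind the URL as text, line by line.
    static String readText(URL url) throws IOException {
        StringBuilder text = new StringBuilder();
        // UTF-8 is assumed here; a real page may declare a different charset.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(url.openStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                text.append(line).append('\n');
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java URLReader <url>");
            return;
        }
        System.out.print(readText(new URL(args[0])));
    }
}
```

The try-with-resources statement closes the reader, and with it the underlying stream, even if reading fails partway through.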

When you run such a program, you should see, scrolling by in your command window, the HTML markup and textual content of the HTML file located at that URL.

Reading from a URLConnection
The second approach produces the same result. However, rather than getting an input stream directly from the URL, the program explicitly retrieves a URLConnection object and gets an input stream from the connection. The connection is opened implicitly by calling getInputStream. Then, like the program that reads directly from the URL, it creates a BufferedReader on the input stream and reads from it.
The output from this program is identical to the output from the program that opens a stream directly from the URL. You can use either way to read from a URL. However, reading from a URLConnection instead of reading directly from a URL might be more useful. This is because you can use the URLConnection object for other tasks (like writing to the URL) at the same time.
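
Here is a sketch of the URLConnection variant. The class name, the timeout value, and the UTF-8 charset are illustrative assumptions; the URL again comes from the command line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;
import java.nio.charset.StandardCharsets;

class URLConnectionReader {
    // Reads the resource as text through an explicit URLConnection.
    static String readText(URL url) throws IOException {
        URLConnection connection = url.openConnection();
        connection.setConnectTimeout(5_000); // illustrative value, in milliseconds

        StringBuilder text = new StringBuilder();
        // getInputStream() opens the connection implicitly if needed.
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                text.append(line).append('\n');
            }
        }
        return text.toString();
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("Usage: java URLConnectionReader <url>");
            return;
        }
        System.out.print(readText(new URL(args[0])));
    }
}
```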

Connecting through HTTP Proxy Servers
For security reasons, most enterprises use firewalls to block unauthorized access to their internal networks. As a result, their employees can't reach the outside Internet (or even some internal servers) directly; they must go through HTTP proxy servers.
Check the settings of your Internet browser to see if you are also sitting behind a firewall, and find out the hostname and port number of the proxy server if you are. Usually, web browsers store proxy parameters under the Advanced tabs of their Settings or Preferences menus.
For example, suppose your proxy server has a known hostname and runs on port 8080.

Three methods for connecting through HTTP proxy servers
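  1. Add the following two lines to your Java application that needs to connect to the Internet, with the http.proxyHost value set to the name of your proxy server. Both values must be strings:
    System.setProperty("http.proxyHost", "");
    System.setProperty("http.proxyPort", "8080");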
  2. If you do not want to hardcode these values, pass them to your program from the command line:
    java -Dhttp.proxyHost= -Dhttp.proxyPort=8080 programName
  3. You can also specify proxy parameters programmatically via the class java.net.Proxy. The code for the same proxy server parameters would look like this (you can replace the name of the server with an IP address):
    Proxy myProxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress("", 8080));
    URL url = new URL("");
    URLConnection urlConn = url.openConnection(myProxy);
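
A self-contained sketch of the third method is shown here. The helper names are hypothetical, and the target URL, proxy host, and proxy port are read from the command line because the article's own values are not shown.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Proxy;
import java.net.URL;
import java.net.URLConnection;

class ProxyExample {
    // Describes an HTTP proxy listening on the given host and port.
    static Proxy httpProxy(String host, int port) {
        return new Proxy(Proxy.Type.HTTP, new InetSocketAddress(host, port));
    }

    // Creates a connection that will be routed through the proxy once opened.
    static URLConnection openThroughProxy(URL url, Proxy proxy) throws IOException {
        return url.openConnection(proxy);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 3) {
            System.err.println("Usage: java ProxyExample <url> <proxyHost> <proxyPort>");
            return;
        }
        Proxy proxy = httpProxy(args[1], Integer.parseInt(args[2]));
        URLConnection connection = openThroughProxy(new URL(args[0]), proxy);
        connection.connect(); // the request goes to the proxy, not directly to the host
        System.out.println("Content type: " + connection.getContentType());
    }
}
```

Unlike the system properties in the first two methods, a Proxy object applies only to the connections you pass it to, so different requests in the same program can use different proxies.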

How to Download Files from the Internet
Combine the class URL with file-reading techniques and you should be able to download practically any unprotected file (such as images, music, and binary files) from the Internet. The trick is in opening the file stream properly: read and write raw bytes rather than text, so that binary content is not corrupted.
To demonstrate, the program uses a Google Drive link to download a text file from the Internet.
After the program runs, you will find the file download.txt on your D drive.
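
The download step could be sketched as below. This is not the article's original listing; it is a minimal version that uses Files.copy for the byte transfer, and it takes the source URL and the target path (for example D:\download.txt) from the command line instead of assuming a particular link.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

class FileDownloader {
    // Copies the raw bytes behind the URL into the target file.
    // Returns the number of bytes written; an existing file is overwritten.
    static long download(URL source, Path target) throws IOException {
        try (InputStream in = source.openStream()) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("Usage: java FileDownloader <url> <targetFile>");
            return;
        }
        long bytes = download(new URL(args[0]), Paths.get(args[1]));
        System.out.println("Saved " + bytes + " bytes to " + args[1]);
    }
}
```

Because the stream is copied byte for byte, the same method works for text files, images, and other binary files.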

