{"id":3032,"date":"2024-03-22T16:00:33","date_gmt":"2024-03-22T16:00:33","guid":{"rendered":"https:\/\/msgprogramator.sk\/?p=3032"},"modified":"2025-07-07T10:57:30","modified_gmt":"2025-07-07T10:57:30","slug":"read-java-files","status":"publish","type":"post","link":"https:\/\/msgprogramator.sk\/en\/read-java-files\/","title":{"rendered":"Java File Handling Part 1: How to read files in Java quickly and efficiently"},"content":{"rendered":"<p>Java file handling is a fundamental aspect of programming, and there are many different ways to read files in Java. Just by searching for a simple Java file-reading example, you may come across classes like <strong><em>InputStream<\/em><\/strong>, <strong><em>FileInputStream<\/em><\/strong>, <strong><em>DataInputStream<\/em><\/strong>, <strong><em>SequenceInputStream<\/em><\/strong>, <strong><em>Reader<\/em><\/strong>, <strong><em>InputStreamReader<\/em><\/strong>, <strong><em>FileReader<\/em><\/strong>, <strong><em>BufferedReader<\/em><\/strong>, <strong><em>FileChannel<\/em><\/strong>, <strong><em>SeekableByteChannel<\/em><\/strong>, <strong><em>Scanner<\/em><\/strong>, <strong><em>StreamTokenizer<\/em><\/strong>, <strong><em>Files<\/em><\/strong> and others.<\/p>\n<p>We\u2019re pretty sure there are even more classes available, and some didn\u2019t make this list. Of course, we haven\u2019t yet mentioned third-party libraries for working with files, which are also an option.<\/p>\n<p>Most of the pre-defined classes for reading and writing files in Java are located in the <strong>java.io<\/strong> and <strong>java.nio.file<\/strong> packages. With the introduction of the Java NIO.2 (New I\/O) file API, the situation became even more complex, and how to work with files efficiently is a common question on programming forums, even among experienced developers.<\/p>\n<p>When working with files, we should know in advance what types of files we will be working with and therefore whether we need to read binary files (e.g. 
music in mp3 format) or text files. We should also know whether the files are small enough to load entirely into memory or so large that they require streamed processing (e.g., line by line).<\/p>\n<p>Each of the file handling classes mentioned above has its use for specific cases. In general, however, we use <strong><em>Stream<\/em><\/strong> classes to read binary data and <strong><em>Reader<\/em><\/strong> classes to read text. Classes that have both of these words in their name bridge the two. For example, <strong><em>InputStreamReader<\/em><\/strong> consumes an <strong><em>InputStream<\/em><\/strong> but behaves as a <strong><em>Reader<\/em><\/strong>, decoding the bytes into characters. <strong><em>FileReader<\/em><\/strong> is basically a combination of <strong><em>FileInputStream<\/em><\/strong> and <strong><em>InputStreamReader<\/em><\/strong>.<\/p>\n<p>As we can see, reading data in Java can get messy. So, in this article, we\u2019ll focus on three common scenarios that cover 90% of use cases:<\/p>\n<ul>\n<li>Reading an entire text file into a <strong>String<\/strong> or a List (or a binary file into a byte[]).<\/li>\n<li>Reading and processing large files that don&#8217;t fit in memory.<\/li>\n<li>Reading files with structured content (e.g., CSV files split by a separator).<\/li>\n<\/ul>\n<h2>A brief history of file reading in Java<\/h2>\n<p>Before Java 7, reading files was cumbersome. The most common approach was using <strong><em>FileInputStream<\/em><\/strong>, which required manual resource cleanup (closing the stream in both success and error cases). Automatic resource management (via <em>try-with-resources<\/em>) didn\u2019t exist yet, so many developers preferred third-party libraries like <a href=\"https:\/\/commons.apache.org\/\" target=\"_blank\" rel=\"nofollow noopener\">Apache Commons IO<\/a> or <a href=\"https:\/\/en.wikipedia.org\/wiki\/Google_Guava\" target=\"_blank\" rel=\"nofollow noopener\">Google Guava<\/a> for simpler file operations.
<\/p>\n<p>This changed with Java 7, which introduced the <em>NIO.2 File API<\/em>, including the <strong>java.nio.file.Files<\/strong> helper class. This class provides convenient one-line methods for reading entire text\/binary files.<\/p>\n<h2>Reading a binary file into a byte array<\/h2>\n<p>Using the <em>Files.readAllBytes()<\/em> method, we can read the contents of the entire file into a byte array:<\/p>\n<pre><code class=\"language-java\" data-line=\"\">import java.nio.file.Files;\nimport java.nio.file.Path;\n\nString fileName = &quot;fileName.dat&quot;;\nbyte[] bytes = Files.readAllBytes(Path.of(fileName));\n<\/code><\/pre>\n<p>The <em>Path<\/em> class represents a file\u2019s location in the filesystem. The <em>Path.of()<\/em> factory method used in these examples is available since Java 11; on older versions, <em>Paths.get()<\/em> does the same job.<\/p>\n<h2>Reading a text file into a variable of the String type<\/h2>\n<p>Since Java 11, we can read the contents of an entire text file into a String using the <em>Files.readString()<\/em> method as follows:<\/p>\n<pre><code class=\"language-java\" data-line=\"\">import java.nio.file.Files;\nimport java.nio.file.Path;\n\nString fileName = &quot;fileName.dat&quot;;\nString text = Files.readString(Path.of(fileName));\n<\/code><\/pre>\n<p>The <em>readString()<\/em> method uses the <em>readAllBytes()<\/em> method internally and then decodes the binary data into a String, using UTF-8 unless a different charset is passed as a second argument.<\/p>\n<h2>Reading a text file line by line<\/h2>\n<p>Text files usually consist of multiple lines. 
If we want to read and process the text line by line, we can use the <em>Files.readAllLines()<\/em> method, available since Java 8, which reads all lines of the file into a list.<\/p>\n<pre><code class=\"language-java\" data-line=\"\">import java.nio.file.Files;\nimport java.nio.file.Path;\nimport java.util.List;\n\nString fileName = &quot;fileName.dat&quot;;\nList&lt;String&gt; lines = Files.readAllLines(Path.of(fileName));\n<\/code><\/pre>\n<p>Then we just iterate through the list and process each line.<\/p>\n<h2>Reading a text file line by line using a String stream<\/h2>\n<p>Java 8 introduced streams as a significant language enhancement. The same version extended the <em>Files<\/em> class with a new <em>lines()<\/em> method that returns the lines of a text file as a stream of Strings. This allows us to use the functionality of streams, e.g. when filtering data. The stream reads the file lazily and keeps it open, so it should be closed after use, for example with <em>try-with-resources<\/em>.<\/p>\n<pre><code class=\"language-java\" data-line=\"\">import java.nio.file.Files;\nimport java.nio.file.Path;\nimport java.util.stream.Stream;\n\nString fileName = &quot;fileName.dat&quot;;\n\/\/ The stream holds the file open, so close it with try-with-resources\ntry (Stream&lt;String&gt; lines = Files.lines(Path.of(fileName))) {\n    lines.filter(line -&gt; line.contains(&quot;ERROR&quot;))\n         .forEach(System.out::println);\n}\n<\/code><\/pre>\n<p>In this example, we print to the console all lines of the file that contain the string &#8220;ERROR&#8221;.<\/p>\n<p>These methods cover the most common scenarios for reading small files. With the exception of <em>Files.lines()<\/em>, which reads lazily, they load the entire file into RAM. For large files, it is advisable to read them in chunks and process each chunk immediately. We will demonstrate this below.<\/p>\n<h2>Reading a large binary file using BufferedInputStream<\/h2>\n<p>With a plain <em>InputStream<\/em>, a binary file can be read one byte at a time (until -1 is returned at the end of the file), which is very slow for large files. 
This can be sped up by reading data via <em>BufferedInputStream<\/em>, which wraps the <em>FileInputStream<\/em> and reads data from the operating system not byte by byte, but in 8 KB blocks that are stored in memory. Our code still reads the file byte by byte, but much faster, because each byte now comes directly from memory.<\/p>\n<pre><code class=\"language-java\" data-line=\"\">import java.io.BufferedInputStream;\nimport java.io.FileInputStream;\n\nString fileName = &quot;fileName.dat&quot;;\ntry (FileInputStream is = new FileInputStream(fileName);\n     BufferedInputStream bis = new BufferedInputStream(is)) {\n    int b;\n    while ((b = bis.read()) != -1) {\n        \/\/ TODO: process b\n    }\n}\n<\/code><\/pre>\n<p>Reading a file block by block (called buffering) is significantly faster than reading it byte by byte.<\/p>\n<h2>Reading a large text file using BufferedReader<\/h2>\n<p>The <strong><em>FileReader<\/em><\/strong> class combines <strong><em>FileInputStream<\/em><\/strong> and <strong><em>InputStreamReader<\/em><\/strong>. For faster file reading, we use the <strong><em>BufferedReader<\/em><\/strong> class, which wraps the <strong><em>FileReader<\/em><\/strong> and uses an 8 KB buffer along with an additional buffer for 8192 decoded characters. 
The advantage of the <strong><em>BufferedReader<\/em><\/strong> class is that it allows us to read and process the text file line by line (instead of character by character).<\/p>\n<pre><code class=\"language-java\" data-line=\"\">import java.io.BufferedReader;\nimport java.io.FileReader;\n\nString fileName = &quot;fileName.dat&quot;;\ntry (FileReader reader = new FileReader(fileName);\n     BufferedReader bufferedReader = new BufferedReader(reader)) {\n    String line;\n    while ((line = bufferedReader.readLine()) != null) {\n        System.out.println(&quot;Line: &quot; + line);\n    }\n}\n<\/code><\/pre>\n<h2>Reading a file in parts using Scanner<\/h2>\n<p>Sometimes, instead of reading a file line by line, we need to read it in parts. <strong><em>Scanner<\/em><\/strong> works by dividing the contents of a file into parts using a delimiter, which can be a constant string or a regular expression. This class can be used for simple <strong>CSV<\/strong> (comma-separated values) files, which have a specific format where data is separated by commas. Such files can be opened as tables in spreadsheet applications such as Excel.<\/p>\n<pre><code class=\"language-java\" data-line=\"\">import java.nio.file.Path;\nimport java.util.ArrayList;\nimport java.util.List;\nimport java.util.Scanner;\n\nList&lt;String&gt; words = new ArrayList&lt;&gt;();\nString fileName = &quot;fileName.csv&quot;;\n\/\/ try-with-resources closes the scanner even if reading fails\ntry (Scanner scanner = new Scanner(Path.of(fileName))) {\n    \/\/ Split on commas and line breaks so a token never spans two rows\n    scanner.useDelimiter(&quot;,|\\\\R&quot;);\n    while (scanner.hasNext()) {\n        words.add(scanner.next());\n    }\n}\n<\/code><\/pre>\n<p>In this example, we read the CSV file token by token (instead of the classic line-by-line approach), splitting on commas and line breaks, and save the tokens in a list for further processing.<\/p>\n<p>In this article, we have shown the most common scenarios of reading data from a file. 
Java makes it easy to handle small files: you can read them into memory with a single method call. And when it comes to reading and processing large files, Java offers efficient solutions using buffering classes.<\/p>\n<p>If you&#8217;re a <a href=\"https:\/\/msg-life.sk\/en\/jobs\/java-programmer-senior\/\" target=\"_blank\" rel=\"noopener\">Java developer<\/a> looking for work, check out our <a href=\"https:\/\/msg-life.sk\/en\/benefits\/\" target=\"_blank\" rel=\"noopener\">employee benefits<\/a> and respond to our <a href=\"https:\/\/msg-life.sk\/en\/jobs\/\" target=\"_blank\" rel=\"noopener\">job offers<\/a>!<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In our article, we&#8217;ll focus on three basic scenarios for reading data that cover 90 percent of cases.<\/p>\n","protected":false},"author":14,"featured_media":3065,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"footnotes":""},"categories":[57],"tags":[],"class_list":["post-3032","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-java"],"acf":[],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/msgprogramator.sk\/en\/wp-json\/wp\/v2\/posts\/3032","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/msgprogramator.sk\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/msgprogramator.sk\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/msgprogramator.sk\/en\/wp-json\/wp\/v2\/users\/14"}],"replies":[{"embeddable":true,"href":"https:\/\/msgprogramator.sk\/en\/wp-json\/wp\/v2\/comments?post=3032"}],"version-history":[{"count":5,"href":"https:\/\/msgprogramator.sk\/en\/wp-json\/wp\/v2\/posts\/3032\/revisions"}],"predecessor-version":[{"id":5324,"href":"https:\/\/msgprogramator.sk\/en\/wp-json\/wp\/v2\/posts\/3032\/revisions\/5324"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/msgprogramator.sk\
/en\/wp-json\/wp\/v2\/media\/3065"}],"wp:attachment":[{"href":"https:\/\/msgprogramator.sk\/en\/wp-json\/wp\/v2\/media?parent=3032"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/msgprogramator.sk\/en\/wp-json\/wp\/v2\/categories?post=3032"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/msgprogramator.sk\/en\/wp-json\/wp\/v2\/tags?post=3032"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}