Automatic file encoding detection in Java

A few months ago I worked on a process that imports Facebook Leads into a legacy system. Facebook sends its advertising data as UTF-16 encoded CSV. The tool also had to support CSV files that had occasionally been edited by hand, which reverted the encoding to something a bit more standard. Thankfully, there was a small library out there that helped. So, in case you ever need to guess whether a file is UTF-16 and don't want to roll your own, here you go:

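// Fail fast if the input file is missing.
File in = new File(inputFile);
if (!in.exists()) {
   throw new IllegalArgumentException("Input file not found");
}
// Guess the encoding using a 4096-byte buffer, with UTF-8 as the default.
Charset cs = CharsetToolkit.guessEncoding(in, 4096, StandardCharsets.UTF_8);
System.out.println("Reading " + inputFile + " as " + cs);
// Decode the file with whatever charset was guessed.
Reader r = new InputStreamReader(new FileInputStream(in), cs);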
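If you're curious what the simplest part of this kind of guessing looks like, here is a rough sketch that only checks for a byte order mark (BOM) at the start of the data. To be clear, this is not the library's code, and the class name BomSniffer and its detect method are made up for illustration. It assumes the file actually starts with a BOM, which hand-edited files often don't, so a real detector has to do more than this.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of any library: BOM-only charset sniffing.
class BomSniffer {

    /** Returns the charset implied by a leading BOM, or fallback if none is found. */
    static Charset detect(byte[] head, Charset fallback) {
        // UTF-8 BOM: EF BB BF
        if (head.length >= 3
                && (head[0] & 0xFF) == 0xEF
                && (head[1] & 0xFF) == 0xBB
                && (head[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;
        }
        if (head.length >= 2) {
            int b0 = head[0] & 0xFF;
            int b1 = head[1] & 0xFF;
            // UTF-16 big-endian BOM: FE FF
            if (b0 == 0xFE && b1 == 0xFF) {
                return StandardCharsets.UTF_16BE;
            }
            // UTF-16 little-endian BOM: FF FE
            // (a UTF-32LE BOM also starts with FF FE; this sketch ignores that case)
            if (b0 == 0xFF && b1 == 0xFE) {
                return StandardCharsets.UTF_16LE;
            }
        }
        // No BOM recognised: use the caller's default.
        return fallback;
    }

    public static void main(String[] args) {
        byte[] utf16le = {(byte) 0xFF, (byte) 0xFE, 0x41, 0x00};
        byte[] plain = {0x41, 0x42, 0x43};
        System.out.println(detect(utf16le, StandardCharsets.UTF_8));
        System.out.println(detect(plain, StandardCharsets.UTF_8));
    }
}
```

One thing to watch if you go this route: Java's UTF-16LE, UTF-16BE and UTF-8 decoders do not strip the BOM, so the first character you read will be U+FEFF unless you skip it yourself.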
