💡 Why I Built This
We often receive reports or invoices in PDF format and need to analyze that data in Excel.
Most online converters either charge money or upload your file to a remote server, which puts your privacy at risk.
So, I decided to create a simple, 100% browser-based PDF to Excel converter using only HTML and JavaScript.
⚙️ How It Works
- It uses PDF.js to extract all text from the uploaded PDF.
- Text items are grouped into lines by their Y-coordinate: an item within 5 units of the previous item's Y position stays on the same line.
- You can preview the extracted text on the same page.
- Finally, using SheetJS (XLSX.js), it converts the data into Excel format and downloads it instantly.
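The line-grouping step can be sketched as a small standalone function. This is a minimal sketch rather than the tool's exact code: `groupItemsIntoLines` and its `tolerance` parameter are hypothetical names, and the sample input only mimics the `{ str, transform }` shape of the text items PDF.js returns from `getTextContent()`, where `transform[5]` holds the Y position.

```javascript
// Hypothetical helper: groups text items into lines by comparing Y positions.
// Each item mimics a PDF.js text item: { str, transform: [a, b, c, d, x, y] }.
function groupItemsIntoLines(items, tolerance = 5) {
  const lines = [];
  let currentLine = "";
  let lastY = null;

  for (const item of items) {
    const y = item.transform[5];
    if (lastY === null || Math.abs(y - lastY) > tolerance) {
      // Y moved by more than the tolerance: close the previous line.
      if (currentLine.trim()) lines.push(currentLine.trim());
      currentLine = item.str;
    } else {
      // Same visual line: append with a space.
      currentLine += " " + item.str;
    }
    lastY = y;
  }
  // Flush whatever is left after the last item.
  if (currentLine.trim()) lines.push(currentLine.trim());
  return lines;
}

// Made-up sample data: two visual lines, the second with a 1-unit Y wobble.
const sampleItems = [
  { str: "Invoice", transform: [1, 0, 0, 1, 50, 700] },
  { str: "#1001", transform: [1, 0, 0, 1, 120, 700] },
  { str: "Total:", transform: [1, 0, 0, 1, 50, 680] },
  { str: "250.00", transform: [1, 0, 0, 1, 120, 681] },
];
console.log(groupItemsIntoLines(sampleItems));
// → [ 'Invoice #1001', 'Total: 250.00' ]
```

The tolerance matters because items on the same visual line do not always share exactly the same Y value. A larger tolerance merges more aggressively, while a smaller one may split a single line in two.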
🧩 Technologies Used
- HTML5 + CSS3 for layout and design
- JavaScript for logic and file handling
- PDF.js for reading PDF content
- SheetJS (XLSX.js) for generating Excel files
💻 Live Demo
Source Code
Here is the complete JavaScript code used in this tool:
// JavaScript Logic
// Lines extracted from the most recently uploaded PDF, shared with the export handler.
let allExtractedLines = [];

document.getElementById("uploadPdf").addEventListener("change", function (e) {
  const file = e.target.files[0];
  if (!file || file.type !== "application/pdf") {
    alert("Please upload a PDF file.");
    return;
  }

  const reader = new FileReader();
  reader.onload = async function () {
    try {
      // PDF.js reads the raw bytes of the file.
      const typedarray = new Uint8Array(reader.result);
      const pdf = await pdfjsLib.getDocument({ data: typedarray }).promise;
      allExtractedLines = [];
      let fullText = "";

      // PDF.js page numbers start at 1.
      for (let i = 1; i <= pdf.numPages; i++) {
        const page = await pdf.getPage(i);
        const textContent = await page.getTextContent();
        const pageLines = [];
        let currentLine = "";
        let lastY = null;

        textContent.items.forEach(item => {
          const thisY = item.transform[5]; // vertical position of this text item
          const text = item.str;
          // A jump of more than 5 units in Y starts a new line.
          if (lastY === null || Math.abs(thisY - lastY) > 5) {
            if (currentLine.trim()) pageLines.push(currentLine.trim());
            currentLine = text;
          } else {
            currentLine += " " + text;
          }
          lastY = thisY;
        });

        // Flush the last line of the page.
        if (currentLine.trim()) pageLines.push(currentLine.trim());
        allExtractedLines.push(...pageLines);
        fullText += pageLines.join("\n") + "\n";
      }

      // Show the preview and enable the export button.
      document.getElementById("textOutput").value = fullText;
      document.getElementById("exportBtn").disabled = false;
    } catch (err) {
      // Corrupt or unreadable PDFs end up here instead of failing silently.
      console.error(err);
      alert("Could not read this PDF.");
    }
  };
  reader.readAsArrayBuffer(file);
});

document.getElementById("exportBtn").addEventListener("click", function () {
  if (!allExtractedLines.length) return;
  // One row per extracted line, each with a single cell in column A.
  const rows = allExtractedLines.map(line => [line]);
  const worksheet = XLSX.utils.aoa_to_sheet(rows);
  const workbook = XLSX.utils.book_new();
  XLSX.utils.book_append_sheet(workbook, worksheet, "PDF_Text");
  XLSX.writeFile(workbook, "converted_from_pdf.xlsx");
});
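The export handler passes an array of arrays to `XLSX.utils.aoa_to_sheet`, where each inner array becomes one spreadsheet row. The sketch below shows that row shape on its own, without SheetJS loaded. `linesToRows` and its `splitOnWhitespace` flag are hypothetical names, not part of the tool. The whitespace split is only an illustrative assumption about how lines might be spread across columns, and it would break cells that contain spaces.

```javascript
// Hypothetical helper: turns extracted lines into an array of arrays (AOA),
// the row format that XLSX.utils.aoa_to_sheet accepts.
function linesToRows(lines, splitOnWhitespace = false) {
  return lines.map(line =>
    // Default: one cell per row, as in the export handler.
    // Optional: one cell per whitespace-separated token (an assumption).
    splitOnWhitespace ? line.trim().split(/\s+/) : [line]
  );
}

// Made-up sample lines.
const sampleLines = ["Item Qty Price", "Pen 2 1.50"];

console.log(linesToRows(sampleLines));
// → [ [ 'Item Qty Price' ], [ 'Pen 2 1.50' ] ]

console.log(linesToRows(sampleLines, true));
// → [ [ 'Item', 'Qty', 'Price' ], [ 'Pen', '2', '1.50' ] ]
```

With the default shape, every line lands in column A. Any column splitting has to happen before the rows reach `aoa_to_sheet`.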
Result
Just upload any PDF and you'll see its text content extracted instantly. Then click Export to Excel and
your Excel file will be downloaded automatically.
🎯 Conclusion
This small utility is a great example of how front-end technologies can perform powerful operations without any backend.
It's lightweight, secure, and a great learning project for anyone exploring JavaScript-based automation tools.
Made with ❤️ by Nidhi Chaturvedi
```