Stream upload to AWS S3 using expressjs

What is stream upload?

When you upload a file to a server, the server usually stores the incoming data in a temporary file or in a memory buffer, then saves it to the final location specified in your code under a proper filename. This is fine when the server itself is the destination, but it causes problems when you upload a large file to AWS S3 storage through your backend server. Because the file is held temporarily on your server, the server needs enough memory or disk space for it, and the whole file must arrive at the server before any of it is sent on to AWS S3. A stream upload skips the temporary storage and forwards the data to the target (in this case AWS S3) as it arrives.
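To make the idea concrete, here is a minimal sketch that uses only Node's built-in stream module. The names `fakeUpload`, `createCountingSink` and `streamToSink` are made up for illustration: the generator stands in for an incoming request body and the sink stands in for an upload target such as S3. The point is that the sink only ever sees one chunk at a time, so memory use does not grow with the size of the file.

```javascript
const { Readable, Writable } = require('node:stream');
const { pipeline } = require('node:stream/promises');

// Hypothetical stand-in for an upload target: it counts bytes as they
// arrive and never holds more than the current chunk.
function createCountingSink(stats) {
  return new Writable({
    write(chunk, _encoding, callback) {
      stats.bytes += chunk.length;
      stats.largestChunk = Math.max(stats.largestChunk, chunk.length);
      callback();
    },
  });
}

// Hypothetical stand-in for an incoming request body: yields `count`
// chunks of `size` bytes each, one at a time.
function* fakeUpload(count, size) {
  for (let i = 0; i < count; i++) {
    yield Buffer.alloc(size, 0x61);
  }
}

// Pipe the source into the sink and report what the sink observed.
async function streamToSink(source) {
  const stats = { bytes: 0, largestChunk: 0 };
  await pipeline(source, createCountingSink(stats));
  return stats;
}
```

Calling `streamToSink(Readable.from(fakeUpload(4, 1024)))` moves 4096 bytes in total, while the largest piece held at any moment is a single 1024-byte chunk.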

Prerequisites

The file and any other form data must be submitted as multipart/form-data. You also need aws-sdk (the Node.js SDK, version 2) and multiparty installed.

npm install aws-sdk --save
npm install multiparty

Then in your .js file, require and configure aws-sdk.

const AWS = require('aws-sdk');
const multiparty = require('multiparty');

const credentials = new AWS.SharedIniFileCredentials({profile: 's3'});
AWS.config.credentials = credentials;
const s3 = new AWS.S3({apiVersion: '2006-03-01'});

Note that I’m specifying the “s3” profile, which has been configured in the shared credentials file on the server.

Multiparty code

Let’s say you have a form with a text input named ‘extrafield’ and a file input named ‘file1’. When this form is submitted to ‘/upload’, multiparty reads the HTTP request’s form data and parses it.

app.post('/upload', function(req, res) {
    const form = new multiparty.Form();
    // declared here so every handler in this request can read it
    let extrafield = null;

    // not a file
    form.on('field', function(name, value) {
        // check if field is named extrafield
        if (name === 'extrafield') {
            extrafield = value;
        }
    });

    // non-field data
    form.on('part', function(part) {
        // only handle the file input named file1; any other part must
        // still be drained, otherwise multiparty stops parsing
        if (!part.filename || part.name !== 'file1') {
            part.resume();
            return;
        }
        // specify bucket name and key
        const params = {
            Bucket: 'my-aws-s3-bucket',
            Key: part.filename
        }
        let S3Obj = new AWS.S3({ params: params});
        S3Obj.upload({Body: part})
            .on('httpUploadProgress', function(e) {
                if (e.loaded === e.total) {
                    console.log(`File upload complete`);
                }
            })
            .send(function(err, data) {
                if (err) {
                    console.log(err);
                    if (!res.headersSent) res.status(500).send('Upload failed');
                    return;
                }
                console.log(`File saved to ${data.Location}`);
                if (!res.headersSent) res.send('Upload complete');
            });
    });

    form.on('error', function(error) {
        console.log(error);
        if (!res.headersSent) res.status(400).send('Invalid form data');
    });

    // start parsing data
    form.parse(req);
});

Multiparty works in an event-handler style. When form data is submitted to /upload, multiparty emits an event for each piece of incoming data (a ‘field’ event for a text field, a ‘part’ event for a file, and so on) and runs the handler you registered for that event.

The stream upload occurs in this part:

// specify bucket name and key
const params = {
    Bucket: 'my-aws-s3-bucket',
    Key: part.filename
}
let S3Obj = new AWS.S3({ params: params});
S3Obj.upload({Body: part})

This uses the S3 upload function, which accepts a readable stream as the Body. Each part that multiparty emits is a readable stream, so it can be passed in directly.
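If you prefer promises over the `.send()` callback, the upload step can be wrapped as in the sketch below. `uploadPart` is a hypothetical helper name, not part of any library. It assumes `s3` behaves like an aws-sdk v2 S3 client, whose `upload()` returns an object with a `.promise()` method, and it takes the client as an argument so the function can be exercised with a stub.

```javascript
// Hypothetical helper: upload one multiparty part and resolve with the
// object's URL. `s3` is assumed to be an AWS.S3 client from aws-sdk v2.
async function uploadPart(s3, bucket, part) {
  const managedUpload = s3.upload({
    Bucket: bucket,
    Key: part.filename,
    Body: part, // the part is a readable stream, so nothing is buffered here
  });
  const data = await managedUpload.promise();
  return data.Location;
}
```

Inside the ‘part’ handler you could then call `uploadPart(s3, 'my-aws-s3-bucket', part)` and send the HTTP response once the promise settles.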

The object returned by upload() emits an httpUploadProgress event, so you can track the upload progress with e.loaded and e.total.
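As a small illustration of using those two values, the sketch below turns a progress event into a log line. `describeProgress` is a made-up helper. It assumes that `total` may be missing while the size of a streamed body is not yet known, so it falls back to reporting only the bytes sent.

```javascript
// Hypothetical helper: format a progress event of the shape { loaded, total }.
// Assumption: `total` can be undefined for a stream of unknown length.
function describeProgress(progress) {
  const { loaded, total } = progress;
  if (typeof total === 'number' && total > 0) {
    const percent = Math.floor((loaded / total) * 100);
    return `${loaded}/${total} bytes (${percent}%)`;
  }
  return `${loaded} bytes sent (total unknown)`;
}
```

You would call it from the handler, for example `.on('httpUploadProgress', (e) => console.log(describeProgress(e)))`.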

If you want to save a file on disk rather than upload it to S3, you can pipe it to a write stream created with fs.

const multiparty = require('multiparty');
const fs = require('fs');

// in your app.post('/upload',...
let fileStream = null;
const form = new multiparty.Form();

form.on('part', function(part) {
    if (part.filename && part.name === 'nonS3file') {
        if (!fileStream) {
            fileStream = fs.createWriteStream(`your-file-path/${part.filename}`);
        }
        part.pipe(fileStream);
    } else {
        // drain parts we don't handle so parsing can continue
        part.resume();
    }
});

// close the file stream
form.on('close', function() {
    if (fileStream) {
        fileStream.end();
        fileStream = null;
        console.log('nonS3file has been saved');
    }
});

// start parsing data
form.parse(req);

Note that the file stream is ended in the handler for multiparty’s ‘close’ event, which fires once the whole request has been parsed.
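One thing to watch when saving to disk is that the filename comes from the client. The sketch below is one possible way to handle that, using only Node built-ins. `safeTargetPath` and `saveToDisk` are hypothetical helper names. It keeps only the last path segment of the submitted name and uses `pipeline()` so the file is closed when the source ends and errors on either side reject the promise.

```javascript
const fs = require('node:fs');
const path = require('node:path');
const { pipeline } = require('node:stream/promises');

// Keep only the last path segment, so a client-supplied name such as
// "../../etc/passwd" resolves to a file inside the upload directory.
function safeTargetPath(uploadDir, clientFilename) {
  return path.join(uploadDir, path.basename(clientFilename));
}

// Stream `source` (for example a multiparty part) into a file and
// resolve with the path that was written.
async function saveToDisk(source, uploadDir, clientFilename) {
  const target = safeTargetPath(uploadDir, clientFilename);
  await pipeline(source, fs.createWriteStream(target));
  return target;
}
```

In the ‘part’ handler this would replace the manual pipe, for example `saveToDisk(part, 'your-file-path', part.filename)`.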

That’s it! No unnecessary memory wasted or temporary storage needed when uploading to S3 with your code.
