I believe many developers have implemented a file upload feature at some point. A simple upload is easy: in PHP, two functions are enough:
file_exists
move_uploaded_file
The first function checks whether the file already exists, and the second moves the uploaded file into place.
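As a baseline, here is a minimal sketch of that simple upload; the form field name `file` and the target directory are assumptions for illustration:

```php
<?php
// Minimal single-request upload: works fine for small files only.
$target = __DIR__ . '/uploads/' . basename($_FILES['file']['name']);

if (file_exists($target)) {
    exit('File already exists.');
}

// move_uploaded_file() relocates the temporary file PHP created for this request.
if (move_uploaded_file($_FILES['file']['tmp_name'], $target)) {
    echo 'Upload complete.';
} else {
    echo 'Upload failed.';
}
```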
However, if the file being uploaded is a few hundred megabytes or a few gigabytes, this simple approach breaks down: the request may time out or run out of memory.
The reason is that while receiving an upload, the server holds the file's contents in memory before writing them to a file. The server also enforces time and memory limits on a single upload request (in PHP these are controlled by directives such as upload_max_filesize, post_max_size, memory_limit, and max_execution_time), and hitting either limit terminates the upload. You can raise these values, but you never know how large they need to be, because the size of the file a user will upload is unknown. Worse, if the memory limit and timeout are set very high, a few large uploads can exhaust the server's memory and bring the whole service down.
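If you want to see what your server currently allows, these php.ini directives can be read at runtime. A quick sketch:

```php
<?php
// Print the upload-related limits currently in effect.
// These directive names are standard php.ini settings.
foreach (['upload_max_filesize', 'post_max_size', 'memory_limit', 'max_execution_time'] as $directive) {
    echo $directive, ' = ', ini_get($directive), PHP_EOL;
}
```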
So how do you solve the problem of large file uploads?
The fix follows directly from the two causes above. Since the file is too large, making the request too long and the memory footprint too high, we can split the large file into smaller pieces and upload them separately; in other words, use chunked uploads.
The front-end uploads the large file in chunks, and the back-end then merges all the chunks in order.
For the front-end chunking we can use the WebUploader JS component, and for the back-end we can use the Laravel framework directly. Chunked uploads with WebUploader need only a few configuration parameters: chunked toggles chunked uploading; chunkSize is the size of each chunk, in bytes; threads is the number of concurrent uploads. Note that since computing the file's md5 value takes some time, we don't use automatic upload here; instead, file selection and upload are split into two steps.
The back-end needs to receive several parameters: file, chunk, chunks, md5, and size. Here md5 identifies the file uniquely, while chunk (the index of the current slice), chunks (the total number of slices), and size are used to determine which slices have been uploaded and whether each arrived completely.
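A minimal sketch of such a receiving endpoint in plain PHP; the parameter names follow the text above, but the md5-keyed temp directory layout is a hypothetical choice for illustration:

```php
<?php
// Receive one chunk and store it under a directory keyed by the file's md5.
$md5    = $_POST['md5'];
$chunk  = (int) $_POST['chunk'];   // index of this slice
$chunks = (int) $_POST['chunks'];  // total number of slices
$size   = (int) $_POST['size'];    // expected size of this slice

// Reject anything that is not a plain md5 hex string (avoids path injection).
if (!preg_match('/^[a-f0-9]{32}$/', $md5)) {
    http_response_code(400);
    exit('Invalid md5.');
}

$dir = sys_get_temp_dir() . '/upload_' . $md5;
if (!is_dir($dir)) {
    mkdir($dir, 0777, true);
}

$target = $dir . '/' . $chunk . '.part';
if (!move_uploaded_file($_FILES['file']['tmp_name'], $target)) {
    http_response_code(500);
    exit('Failed to save chunk.');
}

// Report whether the slice arrived intact and how many slices exist so far.
echo json_encode([
    'ok'       => filesize($target) === $size,
    'received' => count(glob($dir . '/*.part')),
    'total'    => $chunks,
]);
```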
It is also important that when merging the slices on the server, you use streams rather than reading whole files. Because a stream-based merge copies data in small buffers, the entire file never has to be held in memory at once, which avoids a memory overflow while merging.
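A sketch of the merge step using streams, assuming the chunk layout from the previous example:

```php
<?php
// Merge the chunks into the final file using streams, so only a small
// buffer is ever held in memory. Paths follow the hypothetical layout above.
$md5    = $_POST['md5'];
$chunks = (int) $_POST['chunks'];
$dir    = sys_get_temp_dir() . '/upload_' . $md5;

$out = fopen($dir . '/merged', 'wb');
for ($i = 0; $i < $chunks; $i++) {
    $in = fopen($dir . '/' . $i . '.part', 'rb');
    // stream_copy_to_stream() copies via fixed-size internal buffers,
    // so a whole chunk is never loaded into memory at once.
    stream_copy_to_stream($in, $out);
    fclose($in);
    unlink($dir . '/' . $i . '.part'); // discard the slice once merged
}
fclose($out);
```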
In addition, since a chunked upload may only partially succeed, a scheduled task should be set up to clear out stale temporary files and minimize wasted server resources.
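For example, a small cleanup script run periodically (say, from cron); the 24-hour threshold and the upload_* directory pattern are assumptions carried over from the sketches above:

```php
<?php
// Delete chunk directories that have not been touched for 24 hours.
// The upload_* naming and the threshold are hypothetical choices.
$threshold = time() - 24 * 3600;

foreach (glob(sys_get_temp_dir() . '/upload_*', GLOB_ONLYDIR) as $dir) {
    if (filemtime($dir) < $threshold) {
        array_map('unlink', glob($dir . '/*')); // remove leftover slices
        rmdir($dir);
    }
}
```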
This solves the problem of uploading large files.
Although PHP is used here to demonstrate the server side, the same approach applies in Java, Golang, or other languages, because the principle is essentially the same.