
Perhaps this could be modified to use microtime() instead, or a combination of microtime() plus a random number, to reduce the risk of this occurring?

I'd be afraid of it making filenames even more scary. Maybe the field could use a different technique altogether, like incrementing a number if an actual collision was found?

Agreed, appending a -n to the end of the file would make more sense. Like the Unique Text Input field when creating unique handles.

I am afraid that Nick's problem (having two fields in one section) cannot be solved that way, because upon saving, the upload field doesn't "see" another field's attempt to create the same filename.

Actually, using the timestamp has lost its benefit anyway. I invented this logic at a time when Symphony didn't delete the old file when a new file was uploaded. So the filename created by the extension was supposed to help people find the right (last) file in the filesystem in case they needed to.

Nowadays Symphony will keep only one version of a file (for each upload field). So my proposal is to use PHP's native uniqid() function. It should reliably solve Nick's issue (even without adding any extra-stuff), and it's dead-simple. (I like simple things.)

What do you think, Nick and Nils?

Like the Unique Text Input field when creating unique handles.

That doesn't work in your case, Nick! Try and build a section with two Unique Text Input fields. That won't work as expected.

BTW: Here's a little "uniqueness test" for PHP's uniqid() function:

<?php

    // Print two IDs on separate lines inside a <pre> block.
    echo '<pre>';
    echo uniqid();
    echo "\n";
    echo uniqid();

Just save it and call it in your browser. You won't get the same value for both uniqids, because there are some microseconds in between.

I think that incrementing a number would work. The two fields work independently — when writing the file to disk they would check whether a file of that name exists in the directory. If it does, then the field increments the filename until a unique name is generated. Then Symphony moves on to the next field in the section.
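A minimal sketch of that incrementing idea, for illustration only. The helper name nextFreeFilename() is hypothetical and is not part of Symphony or the extension. It assumes the field knows its target directory when it runs.

```php
<?php

// Hypothetical helper, not part of Symphony or the extension.
// Returns $filename unchanged if it is free in $dir; otherwise
// appends -1, -2, ... before the extension until the name is free.
function nextFreeFilename($dir, $filename)
{
    $info = pathinfo($filename);
    $base = $info['filename'];
    $ext  = isset($info['extension']) ? '.' . $info['extension'] : '';

    $candidate = $filename;
    $n = 0;

    // Keep counting up while a file of that name already exists.
    while (file_exists($dir . '/' . $candidate)) {
        $n++;
        $candidate = $base . '-' . $n . $ext;
    }

    return $candidate;
}
```

Note that this only helps if the first field has already written its file to disk before the second field runs the check. If both fields pick their names before either file is written, they would still collide.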

True, uniqid() would work, but as Nils rightly points out, a random string at the end of the filename is pretty ugly. Users are more accustomed to an incremental number, which is what many OSs do.

Nick, I agree that it could work. But the extension would have to rebuild the whole checkPostFieldData() function of the standard File Upload field. Which makes it harder to maintain.

Instead I'd prefer including the "uniqueness" functionality in the core field. (I proposed the same for the Text Input field in a Working Group discussion).

So my proposal is: For the moment I will make it use uniqid() (which will solve your issue, at least). In the coming weeks we try and make this extension superfluous by pimping the core File Upload field.

What about that?

But the extension would have to rebuild the whole checkPostFieldData()

Without having had a look at your code: Can't you do your magic, then call $parent->checkPostFieldData();?
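Roughly what that could look like, as a sketch only. BaseUploadField and UniqueUploadField are stand-in classes invented for this example, and the real checkPostFieldData() in Symphony takes more arguments than shown here. The one thing to take from it is that PHP reaches the overridden method with parent::, not $parent->.

```php
<?php

// Stand-in for the core upload field. The real Symphony class and
// its method signature differ; this only illustrates the pattern.
class BaseUploadField
{
    public function checkPostFieldData($data)
    {
        return 'checked: ' . $data['name'];
    }
}

// Hypothetical subclass: rename the upload first, then defer to the parent.
class UniqueUploadField extends BaseUploadField
{
    public function checkPostFieldData($data)
    {
        $info = pathinfo($data['name']);
        $ext  = isset($info['extension']) ? '.' . $info['extension'] : '';

        // Do the "magic": brand the filename with a unique ID.
        $data['name'] = $info['filename'] . '-' . uniqid() . $ext;

        // Hand the modified data to the parent implementation.
        return parent::checkPostFieldData($data);
    }
}
```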

I will take a look into it at the weekend. And probably initiate a working group discussion about the core field(s) which might provide this functionality directly...

There is a discussion about this already. Symphony 3 will have some of these additional "necessities" on core fields, but I think there's a reluctance to continue back-porting features to Symphony 2. The Unique Upload field has established itself as "the way" to achieve this in Symphony 2, so it makes sense for it to explore the functionality, in my opinion.

@phogue, @nickdunn: I thought about your suggestions for a while. Yes, having a "counting logic" for filenames would indeed be possible without re-inventing the wheel. But while I love this logic for any kind of handles (especially if they will be part of URLs), I don't think that it's the right way to go for filenames.

I'll try and explain why.

Imagine a website having several upload sections. Someone is uploading two versions of the company's "Terms and Conditions" PDF to two separate sections. So now you can download two versions of this PDF having the same filename - hmmm, that's not good. But it gets even worse when you replace the PDFs by newer versions. You might end up having a "terms-and-conditions-7.pdf" in one section which is actually older than "terms-and-conditions-5.pdf" in the other section. Isn't the filename suggesting that #7 is the newer file?

So the "handle logic" which you suggested would make filenames much less unique. Indeed they would only be unique in a single upload folder. Once files are downloaded, they will lose their "uniqueness" (because only the filename remains). Even worse, if filenames are different, they may suggest the wrong "time order" to the user.

The "uniqid() logic" which I proposed will handle the above case much better:

  • There won't be two files with the same name. This also means that you know exactly which file has been downloaded when you know the filename.
  • The filenames will not immediately suggest a "versioning" to the user (which is much better than suggesting the wrong thing). Still, AFAIK, the operating system will sort the files correctly when ordering them alphabetically, because the uniqid() function returns a hex-encoded microtime.
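To illustrate the sorting point: as far as I know, a uniqid() value without a prefix is 13 hex digits, 8 for the seconds and 5 for the microseconds. The sketch below decodes one. The helper name decodeUniqid() is made up for this example.

```php
<?php

// Splits a prefix-less uniqid() value into its two parts:
// 8 hex digits of seconds, then 5 hex digits of microseconds.
function decodeUniqid($id)
{
    return array(
        'seconds'      => hexdec(substr($id, 0, 8)),
        'microseconds' => hexdec(substr($id, 8, 5)),
    );
}

$first  = uniqid();
$second = uniqid();

// Show the moment the first ID was generated (UTC).
$parts = decodeUniqid($first);
echo gmdate('Y-m-d H:i:s', $parts['seconds']), "\n";

// Both IDs have the same fixed width, so a plain string
// comparison follows the order in which they were generated.
var_dump(strcmp($first, $second) < 0);
```

Because the width is fixed, files that share the same base name should sort by upload time in a directory listing. Files with different base names are still sorted by base name first.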

What do you think?

Nick, Nils? Any thoughts?

Sorry, I forgot this was directed at me.

I see your "versioning" argument, but feel it's quite a contrived example. For example if you did have users uploading the same T&Cs into the CMS into two separate locations then I'd suggest this is an architecture/design problem instead. But that's not really relevant.

I agree that a unique string appended to the file guarantees uniqueness. The primary benefit of the numeric increments, however, is that the first file retains its original clean file name. Only when you try to upload a second file of the same name do you see the ugly unique file name. Would there be a way to only append the unique string if a naming conflict exists? My worry is that a file name of terms-and-conditions-ff61e2.pdf is ugly and potentially confusing for the user and we should be trying to avoid it whenever possible (i.e. only appending it if required to avert a conflict).

Another possibility is that the field creates a unique folder to place the files in. Each uploaded file would result in a folder containing just one file (architectural ugliness) but with the benefit of retaining the original desired file name (user experience friendliness).

Just thinking outside of the box: isn't it possible to set a different filename with a header?

That way, the server could retain the unique filename by appending a unique string, but the user would always download a file with the original name. Thus letting the OS handle duplicate filenames the way the user is used to it.
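A rough sketch of the header idea, assuming the download goes through a PHP script rather than a direct link to the file. The helper name downloadDisposition() and the paths in the comments are hypothetical.

```php
<?php

// Hypothetical helper: builds a Content-Disposition value so that a
// file stored under a unique name is offered under its original name.
function downloadDisposition($originalName)
{
    // Drop characters that would break the quoted filename.
    $safe = str_replace(array('"', '\\', "\r", "\n"), '', $originalName);

    return 'attachment; filename="' . $safe . '"';
}

// In a download script, before sending the file body
// (the stored path below is a placeholder, not a real location):
// header('Content-Type: application/pdf');
// header('Content-Disposition: ' . downloadDisposition('terms-and-conditions.pdf'));
// readfile('/path/to/uploads/stored-unique-name.pdf');
```

This only works if the original name is stored somewhere at upload time, and it does not apply when the file is linked directly from the upload folder.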

Two birds with one stone?

My worry is that a file name of terms-and-conditions-ff61e2.pdf is ugly and potentially confusing for the user

Have you ever had any complaints with the current implementation (which is not better in this respect)?

isn't it possible to set a different filename with a header?

Yes, but again this complicates things. We would have to guess a "beautiful" filename (or save the original upload name, which might be ugly itself).

The only argument against my dead-simple solution is "beauty in filenames" -- isn't this a programmer's idea? Are you really sure that users/visitors care about this?

Honestly speaking, nobody ever complained about the filenames of the Unique Upload Field. Having experienced a uniqueness issue we are suddenly talking about solving it in a completely different way (which eventually might reduce uniqueness). This is really nerdy, isn't it?

Would it not be beneficial to use unique hashed filenames for each upload, then save a lookup to the original name in a db table? That way when a user downloads a file, it could be matched to the original name which is what the user downloads?

I've only really been skim reading this thread, so apologies if it's a useless suggestion...

i use michael's extension almost exclusively for every upload operation i implement when it comes to site assets (galleries, sliders, etc). however, in the example presented ("terms and conditions"), i think it's a valid point to consider better architecture. i hate to take the stance of 'make the user do it', but i don't think it's completely inappropriate to use the generic upload field and suggest that files be given specific names when talking about downloadables.

@designermonkey: Yes, you are right, and similar ideas have been proposed. My only problem is: We are talking about different understandings of "uniqueness". To me those proposed solutions do not provide uniqueness. I think a filename must be seen outside of the path context (which is why you can't compare it to handles, which are always in a path context). Up to now the extension is "branding" the filename, thus making it unique in every context, no matter if it has been downloaded or is moved on the server. I like this kind of "branding", and my users have never complained about it.

But I have the feeling that everything has been said. So maybe we'll end up "forking" this baby. I wouldn't like to build two different filename functionalities into one extension.

Unique Upload Field updated to version 1.4 on 1st of June 2011

I updated the logic for unique filenames to use PHP's uniqid() function instead of the UNIX time. This will definitely solve problems with two files having the same name being uploaded at the same time.

At the moment I don't have the time to think about the proposals to create "less unique, but better looking" filenames. Maybe we can come back to this later.
