1.) I've found that the box size equilibrates in about 5 ns of NPT. Your mileage may vary.
2.) I find it convenient to make my solvent boxes exactly 10000 residues, so the resids run from 0 through 9999 and fit in the four-character resid field of the PDB format. The size of the solvent box itself doesn't matter. If you need a bigger box, the solvate plugin will replicate it for you; if you need a smaller one, the plugin will cut it down to size.
3.) That isn't how the solvate plugin works. It places solvent molecules such that they aren't too close to the solute. If the initial arrangement is too tightly packed, the simulation box will expand under NPT, and if there is too much space, the box will compact. Basically, use NPT for equilibration.
4.) Sure, why not? For mixed liquids, I've built a biphasic system and then let the two liquids mix. It doesn't take very long if the solvents are miscible. Gases should expand outward pretty quickly. Indeed, if you just want to simulate gases, the solvate plugin is overkill: your particles are going to be REALLY far apart anyway, so initially placing them on a grid wouldn't be a bad approximation.
5.) You just need something that VMD will tag once per residue. If you monkey with the residue definitions in the topology file you are using, you can name one atom in each solvent molecule "SOLV" or something, and then use the key selection "name SOLV". Or just mix single-component solvent boxes through simulation.
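
For the gas case in point 4, here is a rough sketch of what "placing things on a grid" could look like. It is not part of VMD or the solvate plugin; the function name grid_positions and its arguments are made up for illustration, and it assumes a cubic box with one point particle per grid cell (box length in whatever units you use, e.g. Angstroms).

```python
from itertools import islice, product


def grid_positions(n_particles, box_length):
    """Return n_particles (x, y, z) tuples on a simple cubic grid.

    Hypothetical helper: assumes a cubic box of edge box_length with its
    corner at the origin, and puts each particle at the center of a cell.
    """
    if n_particles < 1 or box_length <= 0:
        raise ValueError("need at least one particle and a positive box length")

    # Smallest number of cells per edge whose cube holds every particle.
    per_edge = 1
    while per_edge ** 3 < n_particles:
        per_edge += 1

    spacing = box_length / per_edge

    # Walk the cells in index order and stop once all particles are placed.
    cells = islice(product(range(per_edge), repeat=3), n_particles)
    return [tuple((idx + 0.5) * spacing for idx in cell) for cell in cells]
```

Calling grid_positions(10, 30.0) gives ten well-separated starting coordinates inside a 30 x 30 x 30 box. You would still need to write them out in whatever format your setup tools read, and equilibrate afterwards as usual.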


