This question has been addressed on OpenData SE; it might give interesting pointers:
Suppose that I have some sort of specialized data, perhaps that I've collected myself or helped to collect. And suppose that
nothing prevents me from handing this data out to people. How
should I go about distributing/storing this data so that others
will be able to find it and use it, whenever that time may be?
Targeting specialised repositories as per @Joe's answer is indeed an
excellent way to go about disseminating data, but what if no such
specialised repository exists, or you do not wish to target one
specific community in particular?
A methodology for exposing Open Data using generic principles is the
5-star Open Data scheme originally proposed by Tim Berners-Lee.
The core rationale of 5-star Open Data is that you make your data more
easily accessible, processable and interoperable with each successive star:
★ Put your data on the Web in some format with an Open Licence.
People can access it through their browsers and spend some time
figuring out how they can download/access/process/use it. (Avoid
problems like this for your client.)
★★ Put your data in a machine-processable format. For example,
having a table in Excel is better than having a snapshot printed to
PDF or an image, because people can download it and start running
experiments over it. (Avoid problems like this.)
★★★ Use non-proprietary formats. For example, providing data as a
CSV is often better than as an Excel file because CSV can be directly
processed by a wider range of (free/open-source) tools and programming
languages. (I can't find anyone complaining about Excel on here yet,
but, e.g., this is a similar problem.)
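To see why a non-proprietary format matters in practice, here is a minimal Python sketch showing that a plain CSV can be parsed and analysed with nothing but the standard library (the city names and pollution figures are made up for illustration):

```python
import csv
import io

# A small made-up dataset published as plain CSV (★★★): any tool or
# language can parse it without proprietary software.
data = """city,no2_ugm3
London,40
Paris,35
Berlin,28
"""

# Read the CSV and start "running experiments over it" right away.
rows = list(csv.DictReader(io.StringIO(data)))
average = sum(float(r["no2_ugm3"]) for r in rows) / len(rows)
print(round(average, 1))  # mean NO2 concentration across the cities
```

The same one-liner analysis would require a dedicated reader library (or a manual export step) if the table were only available as an Excel file or a PDF snapshot.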
★★★★ Use URIs to denote things. For example, let's say you provide
a bunch of pollution measures for cities and somebody would like to
specifically reference the pollution measure for London. Assigning a
URI to London in your local data provides a globally unique identifier
for that city that people can reference and point to. There are, for
example, related proposals for embedding URI fragment identifiers in
CSV files. (Avoid problems like this or this.)
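A minimal sketch of the URI step, continuing the pollution example: each row is given an identifier others can cite. The base URI `http://example.org/city/` is hypothetical; in practice you would mint URIs under a domain you control so they stay stable:

```python
import csv
import io

# ★★★★: give each thing in the data a URI so others can point at it.
# The base URI below is a placeholder for illustration only.
BASE = "http://example.org/city/"

data = """city,no2_ugm3
London,40
Paris,35
"""

rows = list(csv.DictReader(io.StringIO(data)))
for row in rows:
    # Mint a globally unique identifier for each city.
    row["uri"] = BASE + row["city"]

# Somebody who wants to cite "the pollution measure for London" can
# now reference this URI instead of an ambiguous string label.
print(rows[0]["uri"])
```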
★★★★★ Link your data to other data to provide context. So you have
created a URI for London in your data and people can point to it.
However, which London are you referring to? London, England, or
London, Ontario? If you link your local URI for London to the
Wikipedia page about the London you mean (or, even better,
to the DBpedia URI for the specific place to which you refer),
this provides context as to what you mean. (Avoid problems like
this.)
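The linking step can be sketched as a single RDF statement, emitted here as N-Triples with the standard library only (a real pipeline would typically use an RDF library; the local URI is hypothetical, while `dbpedia.org/resource/London` is DBpedia's actual identifier for London, England):

```python
# ★★★★★: link your local URI to an external one to disambiguate it.
# The local base URI is a made-up placeholder for illustration.
local = "http://example.org/city/London"
dbpedia = "http://dbpedia.org/resource/London"

# owl:sameAs is the conventional property for "these two URIs identify
# the same thing"; emit the link as one N-Triples statement.
triple = "<{}> <http://www.w3.org/2002/07/owl#sameAs> <{}> .".format(
    local, dbpedia
)
print(triple)
```

Anyone consuming your data can now follow the link to DBpedia and know unambiguously which London the measurements describe.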
The shift from ★★★ to ★★★★(★) is quite an ambitious one, and technical
proposals are still being made on how best to achieve it, but 5-star
Open Data is great because your data are then available on the Web
under open licences in open structured formats, where everything of
importance is given a URI that can be referenced and linked across the
Web, allowing for future discovery and re-use. A common methodology for
creating 5-star Open Data (again proposed by Tim Berners-Lee) is
Linked Data, which assumes RDF as a common interoperable
data format. But if that all sounds like too much, getting as far as
★★★ data is still great.
Again, you can check out this description of 5-star Open Data
for more information, as well as a related question here.
A useful resource for the generic cataloguing of Open Datasets is the
CKAN project, whose related DataHub repository is a
great place to list and publicise your dataset. You can check out a
bunch of 5-star Open Datasets here.