Using the Celex aligner
Our Celex aligner is finally online. You can access it from the top menu.
What is Celex?
A Celex number is a unique code given to every regulation, directive or other type of EU document. For example, the Celex number of Regulation No 1024/2013 is "32013R1024", where "3" is a domain code, "2013" is the adoption year, "R" is used for regulations, and, finally, "1024" is the number of the document. You can find further details on the EUR-Lex website.
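To make that structure concrete, here is a minimal Python sketch that splits a Celex number into the four parts described above. The function name `parse_celex` and the pattern are our own assumptions for illustration: the pattern covers only the simple shape of the example (one digit, a four-digit year, one or two letters, a four-digit number), while real Celex numbers come in more variants than that.

```python
import re
from typing import NamedTuple


class CelexParts(NamedTuple):
    domain: str    # e.g. "3"
    year: str      # e.g. "2013"
    doc_type: str  # e.g. "R" for regulations
    number: str    # e.g. "1024"


# Simplified, assumed pattern: it only matches numbers shaped like "32013R1024".
_CELEX_RE = re.compile(r"^(\d)(\d{4})([A-Z]{1,2})(\d{4})$")


def parse_celex(celex: str) -> CelexParts:
    """Split a simple Celex number into its parts, or raise ValueError."""
    match = _CELEX_RE.match(celex.strip())
    if match is None:
        raise ValueError(f"Not a Celex number this sketch understands: {celex!r}")
    return CelexParts(*match.groups())


print(parse_celex("32013R1024"))
# CelexParts(domain='3', year='2013', doc_type='R', number='1024')
```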
What is an alignment?
When translating a document, you often have some reference translations (together with the original text) you want to follow for consistency. Sometimes the client provides reference translations, sometimes you use your own past translations.
If you translate using translation memory software, you'll want to introduce the reference translations into your database. For that purpose, you will have to create a parallel list of original and translated sentences. That process is called alignment.
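If you are curious what such a parallel list looks like once it becomes a translation memory, the sketch below turns a list of (original, translation) pairs into a small TMX document using only Python's standard library. It is an illustration, not the code behind our aligner: the function name `pairs_to_tmx`, the header values and the sample sentences are all made up for the example.

```python
import xml.etree.ElementTree as ET

# ElementTree writes this qualified name out as the "xml:lang" attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def pairs_to_tmx(pairs, src_lang, tgt_lang):
    """Build a TMX 1.4 document (as a string) from aligned text pairs."""
    tmx = ET.Element("tmx", version="1.4")
    # Header values below are placeholders chosen for this sketch.
    ET.SubElement(tmx, "header", {
        "creationtool": "example-sketch",
        "creationtoolversion": "0.1",
        "segtype": "paragraph",
        "o-tmf": "none",
        "adminlang": "en",
        "srclang": src_lang,
        "datatype": "plaintext",
    })
    body = ET.SubElement(tmx, "body")
    for source, target in pairs:
        tu = ET.SubElement(body, "tu")  # one translation unit per aligned pair
        for lang, text in ((src_lang, source), (tgt_lang, target)):
            tuv = ET.SubElement(tu, "tuv", {XML_LANG: lang})
            ET.SubElement(tuv, "seg").text = text
    return ET.tostring(tmx, encoding="unicode")


# Invented sample pairs, English to French.
aligned = [
    ("Article 1", "Article premier"),
    ("This Regulation shall enter into force.", "Le présent règlement entre en vigueur."),
]
print(pairs_to_tmx(aligned, "en", "fr"))
```

Each pair ends up in its own translation unit, which is what a translation memory tool reads when you import the file.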
A Celex aligner?
Some translators use EU documents extensively as a reference for their translations. Those EU documents are adopted in all EU languages and in many cases they are an official terminological reference for subsequent translations in certain fields.
However, manually aligning those documents is time-consuming and prone to frustrating software crashes. As EU laws are usually consistently formatted across all languages, we have found a way to align them automatically (well, most of the time).
The basic workflow
The alignment interface consists of a simple form where you enter the Celex number of the document you want to align, the source language and the target language. After pressing "Submit", you are taken to a page where you can download the TMX alignment (if the alignment is successful) and an HTML table where you can manually refine and save your alignment. If the alignment is unsuccessful, you get only the HTML file, which you can adjust and finally export into a TMX memory.
The form has three fields and a submission button. The fields are as follows:
The Celex number field is where you enter the number of the document you want to align. If you enter an invalid Celex number, the script will give you an error on the next screen.
The source and target languages are drop-down menus, where the language codes are ordered alphabetically.
Here's an example:
There are also some optional fields, but you can safely ignore them. At the end you'll see a big blue button:
After hitting the "Submit" button, you'll get a new page with something like this:
There's an automatically generated title with several technical details and a green or blue button labeled "Executing" or "Waiting". Then there's an empty table and a black box (the console) with some abstruse stuff. At the end of the process, which usually takes from 10 to 30 seconds, the table will be populated with the resulting file(s) and the console will include the error messages, if any.
The page will NOT refresh itself with the results. You have to refresh/reload it manually using the appropriate icon or the F5 key.
After reloading the page, you'll get the final results:
There's a nice green "Success" button and a list of output files. You will usually see three files in the list. First, there's the TMX file, which you can readily import into your favorite translation memory tool. Second, there's an HTML file to manually refine the alignment. Third, there's a "log.txt" file, which is usually empty.
A failed alignment
Sometimes the alignment fails. In that case, the resulting web page looks different:
The console header is red and there's also a red error message telling us that the alignment failed. The list comprises just two files: the HTML file and the "log.txt" file, which contains the same red error message shown in the console.
The HTML file contains an imperfect alignment, which you can correct manually, usually in a couple of minutes. You can either open the file immediately in the browser or download it and open it later. Note however that, in order to properly use the file, you need to be connected to the Internet when you load it in the browser.
The manual alignment
Let's open the HTML file:
Basically, there's a big table with two columns, with some instructions at the beginning and a big green button at the end.
Let's look more carefully at the first part:
First, the Celex code and the source and target languages are available in the top left corner.
Below them, there's a big backup "Save and continue" button. If you get tired of aligning, you can click it to save your work in a new HTML file. You can reopen that file later to complete the alignment.
On the right side of the header you will find a description of the buttons and other functionalities useful during the alignment.
Then comes the table. There are two columns, for the source and target languages. A source language paragraph is automatically connected with the target language paragraph in the same row. You do not have to connect them one by one.
(These two segments are already connected to each other.)
Every paragraph has four buttons, which can be used to correct the alignment. You can also edit the text directly by clicking in a cell:
In certain cases, the paragraph is very long and doesn't fit completely in the cell. You can use the scrollbar to the right to view the entire text.
Now, let's see what those four buttons do.
Adding a new paragraph
Suppose we have the following alignment:
You can see that the paragraph with the date is absent in the target language. You can fix that by adding a new paragraph below the first paragraph in the target language column. You add a new paragraph by clicking the green button in the cell above the insertion place. This will create a new cell with an editable text "<Add text here.>".
Then you edit the text as shown before and you're done.
Deleting an existing paragraph
In some cases the alignment generates useless paragraphs, which you would rather delete.
To do that, click the second button (the red one). The buttons will change into a confirmation dialog:
If you press "Cancel", nothing will happen. If you press "Delete", the respective paragraph will be deleted:
Merging two paragraphs
In some cases two paragraphs should be combined into one paragraph. To do that, use the purple button in the first paragraph.
You are asked to confirm the action.
If you click "Cancel", nothing happens. If you click "Merge", the two paragraphs will be combined.
Splitting a paragraph
You may also need to split large paragraphs like in the following example.
The first source language paragraph actually contains two separate sentences and we need to split it in order to align the text with the target language text. To do that, we use the orange button. We click it once to expand the cell and see the entire text. You'll see that the button changes too.
Now use your mouse to click wherever you want to split the cell into two separate cells. You would probably split the paragraphs between the words "alike." and "Coordination". After clicking between those words, you get the following outcome:
Now the texts are aligned.
If, for some reason, you change your mind before actually splitting the text, click the "Split" button again and everything returns to the previous state. If you change your mind after splitting the paragraph, use the "Merge" button to combine the two parts again.
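If it helps to think of the four buttons in more abstract terms, each column behaves like a simple list of paragraphs, and each button is one list operation. The Python sketch below is only a mental model, not the code of the HTML file: the function names are invented, and joining merged paragraphs with a single space is our assumption.

```python
PLACEHOLDER = "<Add text here.>"


def add_below(column, index):
    """Green button: insert an editable placeholder below paragraph `index`."""
    column.insert(index + 1, PLACEHOLDER)


def delete(column, index):
    """Red button: remove paragraph `index`."""
    del column[index]


def merge_with_next(column, index):
    """Purple button: combine paragraph `index` with the one below it."""
    column[index:index + 2] = [column[index] + " " + column[index + 1]]


def split_at(column, index, offset):
    """Orange button: cut paragraph `index` in two at character `offset`."""
    text = column[index]
    column[index:index + 1] = [text[:offset].rstrip(), text[offset:].lstrip()]


# Invented sample text: one cell that really holds two sentences.
source = ["This applies to all alike. Coordination is required.", "Article 2"]
split_at(source, 0, source[0].index("Coordination"))
print(source)
# ['This applies to all alike.', 'Coordination is required.', 'Article 2']
```

In this model, calling `merge_with_next(source, 0)` right after the split restores the original list, which mirrors the way the "Merge" button undoes a split.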
Saving the alignment
You have checked the alignment and everything seems to be OK. Now you can save your alignment in the TMX format by clicking on the big button at the end of the table.
The "Save alignment" button works only if the number of paragraphs in each column is equal. Otherwise, you'll get a message telling you to complete the alignment before trying to export it.
Also, don't forget that you can save an intermediate version of your alignment in progress, using the blue "Save and continue later" button at the top of the page.
Do NOT close or refresh/reload the alignment file without saving it first, because you would lose all your work. You will get a warning if you try to do that.
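A final note on why the "Save alignment" button insists on an equal number of paragraphs in each column: paragraphs are paired row by row, so a missing paragraph on one side would leave one with no counterpart. The short Python sketch below shows the idea; the function name `pair_columns` and the wording of the message are hypothetical, not taken from the actual file.

```python
def pair_columns(source_column, target_column):
    """Pair paragraphs row by row, refusing columns of unequal length."""
    if len(source_column) != len(target_column):
        raise ValueError(
            f"Complete the alignment first: {len(source_column)} source "
            f"paragraphs vs {len(target_column)} target paragraphs."
        )
    return list(zip(source_column, target_column))


print(pair_columns(["Article 1", "Article 2"], ["Article premier", "Article 2"]))
# [('Article 1', 'Article premier'), ('Article 2', 'Article 2')]
```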