workaround for checking tags in table
Thread poster: Bjørnar Magnussen
Bjørnar Magnussen
Bjørnar Magnussen  Identity Verified
Local time: 02:55
English to Norwegian
+ ...
Aug 30, 2012

Hi there

I have a long bilingual table with strings such as

EN: <h1>Automatic renewal</h1>
NO: <h1>Automatisk fornyelse</h1>

I need a way to ensure that both columns contain identical tags in every row (i.e. the text within the angle brackets). Is there a QA tool that will take care of this? Or can you suggest a workaround?

Thanks in advance, Bjornar

[Edited at 2012-08-30 14:43 GMT]

[Edited at 2012-08-30 14:44 GMT]

[Edited at 2012-08-30 14:45 GMT]

[Edited at 2012-08-30 14:45 GMT]

[Edited at 2012-08-30 14:46 GMT]


 
Joakim Braun
Joakim Braun  Identity Verified
Sweden
Local time: 02:55
German to Swedish
+ ...
HTML validator Aug 30, 2012

Not a great solution, but: Make an HTML document and run it through an HTML validator or web browser with error checking. (Unbalanced tags will probably show visually, too.)

http://validator.w3.org/


 
Bjørnar Magnussen
Bjørnar Magnussen  Identity Verified
Local time: 02:55
English to Norwegian
+ ...
TOPIC STARTER
But this tool is for checking markup validity? Aug 30, 2012

Joakim Braun wrote:

Not a great solution, but: Make an HTML document and run it through an HTML validator or web browser with error checking. (Unbalanced tags will probably show visually, too.)

http://validator.w3.org/


I need to check that both columns contain identical tags (see example above), not to check markup validity.


 
Joakim Braun
Joakim Braun  Identity Verified
Sweden
Local time: 02:55
German to Swedish
+ ...
Sorry Aug 30, 2012

Sorry for wasting your time, the penny has dropped now...

I can, however, tell you how to write brackets in this forum: Use the HTML entity codes (&lt; for left bracket, &gt; for right bracket... <tag>)

[Edited at 2012-08-30 14:40 GMT]


 
Bjørnar Magnussen
Bjørnar Magnussen  Identity Verified
Local time: 02:55
English to Norwegian
+ ...
TOPIC STARTER
Thanks for your tips Aug 30, 2012

I have changed the brackets in my original post. Hope it's clearer now.

 
Joakim Braun
Joakim Braun  Identity Verified
Sweden
Local time: 02:55
German to Swedish
+ ...
Excel: Tag extraction Aug 30, 2012

In Excel you could extract the tags into separate cells. Then you can easily compare EN and NO tags with a suitable formula.

Text is in A1.
Extract opening tag.
=MID(A1;1;FIND(">";A1))

Assume result of above function is stored in C1.
Extract text following opening tag.
= MID(A1;LEN(C1) + 1; LEN(A1) - LEN(C1))

Assume result of above function is stored in D1.
Extract closing tag.
= MID(D1; FIND("<";D1); LEN(D1) - FIND("<";D1) + 1)

In this way we can create columns with extracted opening and closing tags.

---

Now, assuming you have EN tag in F1 and NO tag in G1, you could set this formula for H1:
=IF(F1=G1;"";"Bad")

This will mark all rows with variant tags with "Bad".

(HTML tag names are case-insensitive, and so is Excel's = comparison, so <H1> and <h1> count as a match here; use EXACT(F1;G1) instead if you want a case-sensitive check.)

---

I think you can take it from here.
(With very little tinkering you could automate the tagging of the target column.)
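
For anyone who prefers a script to spreadsheet formulas, the same comparison can be sketched in Python. This is only a rough sketch and not the Excel method above: it assumes the table has already been exported as a list of (EN, NO) string pairs, and the names TAG_RE, extract_tags and find_mismatches are made up for illustration. Tags are lowercased before comparing, since HTML tag names are case insensitive.

```python
import re

# Matches anything between angle brackets, e.g. <h1> or </h1>.
TAG_RE = re.compile(r"<[^<>]+>")


def extract_tags(text):
    """Return the tags found in text, in order of appearance, lowercased.

    Lowercasing the whole tag also lowercases attribute values, which is
    an assumption that may not suit every file.
    """
    return [tag.lower() for tag in TAG_RE.findall(text)]


def find_mismatches(rows):
    """Return the 1-based numbers of rows whose EN and NO tags differ."""
    bad = []
    for number, (source, target) in enumerate(rows, start=1):
        if extract_tags(source) != extract_tags(target):
            bad.append(number)
    return bad


# Hypothetical sample rows: (EN, NO)
rows = [
    ("<h1>Automatic renewal</h1>", "<h1>Automatisk fornyelse</h1>"),
    ("<h1>Automatic renewal</h1>", "<h2>Automatisk fornyelse</h1>"),
    ("<b>Price</b>", "<B>Pris</B>"),
]
print(find_mismatches(rows))  # [2]
```

Unlike the formulas, this handles any number of tags per cell, but it also insists that the tags appear in the same order in both columns, which may be too strict for some language pairs.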

[Edited at 2012-08-31 20:03 GMT]


 
Samuel Murray
Samuel Murray  Identity Verified
Netherlands
Local time: 02:55
Member (2006)
English to Afrikaans
+ ...
Many/most CAT tools can do it Aug 30, 2012

Bjørnar Magnussen wrote:
I need a way to ensure that all rows in both columns contain identical tags (i.e. the text within the angle brackets). Does there exist a QA tool that will take care of this?


Well, many CAT tools can do it, but you need to preprocess your text a bit and learn how to use the CAT tool, and that will take some explaining.

If I had to do it, I would convert the table to an uncleaned RTF file, change the style of the tags to tw4winInternal, and tell Wordfast Classic to check for tag errors. However, getting to that point is not simple if you know nothing of Wordfast.

Another way to do it is by using the pofilter tool, if you can convert your table into a PO file. If you can convert your table to a three-column Excel file (column A with just nonsense text, column B with the source text and column C with the target text), you can use csv2po to create a PO file. You can then use pofilter (using the "xmltags" check) to check the files. But again, this is fairly advanced stuff for someone who hasn't done it before (and differences between UTF-8 and ANSI will drive you up the wall).

http://sourceforge.net/projects/translate/files/Translate%20Toolkit/1.9.0/
http://translate.sourceforge.net/wiki/toolkit/csv2po
http://translate.sourceforge.net/wiki/toolkit/pofilter
http://translate.sourceforge.net/wiki/toolkit/pofilter_tests
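
If the table is not in Excel, the three-column layout described above could also be produced with a short Python script. This is a minimal sketch under stated assumptions: the rows are already available as (source, target) pairs, the function name write_three_column_csv is invented for illustration, and a simple row label stands in for the "nonsense text" in column A. Whether csv2po wants a header row or a particular encoding is something to check in its documentation.

```python
import csv
import io


def write_three_column_csv(rows, out):
    """Write (source, target) pairs as three-column CSV lines."""
    writer = csv.writer(out)
    for number, (source, target) in enumerate(rows, start=1):
        # Column A is only a placeholder; a row label is used here.
        writer.writerow([f"row{number}", source, target])


# Hypothetical sample row: (EN, NO)
rows = [("<h1>Automatic renewal</h1>", "<h1>Automatisk fornyelse</h1>")]
buffer = io.StringIO()
write_three_column_csv(rows, buffer)
print(buffer.getvalue())
```

To write a real file instead of an in-memory buffer, open it with open(path, "w", encoding="utf-8", newline="") so the encoding is explicit, which may spare you some of the UTF-8 versus ANSI trouble.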

Samuel


 
Bjørnar Magnussen
Bjørnar Magnussen  Identity Verified
Local time: 02:55
English to Norwegian
+ ...
TOPIC STARTER
Thanks a lot, Samuel and Joakim Aug 30, 2012

I will try out your suggestions first thing tomorrow!

Regards, Bjornar


 
Joakim Braun
Joakim Braun  Identity Verified
Sweden
Local time: 02:55
German to Swedish
+ ...
Like this Aug 30, 2012

This was a fun little problem. I made an Excel sheet that automates tag extraction, text extraction and transfer of tags.
www.jfbraun.com/tagtest.xlsx.zip


 
Hans Lenting
Hans Lenting
Netherlands
Member (2006)
German to Dutch
Nice solution! Sep 9, 2012

Joakim Braun wrote:

This was a fun little problem. I made an Excel sheet that automates tag extraction, text extraction and transfer of tags.
www.jfbraun.com/tagtest.xlsx.zip


Hi Joakim, thanks for diving into this.

I can use this solution for translating error messages in Excel too:

<text>Nuclear meltdown! Home system is shutting off in 5 seconds ...</text>

Hans

[Edited at 2012-09-09 06:36 GMT]


 
Hans Lenting
Hans Lenting
Netherlands
Member (2006)
German to Dutch
LineFeeds are being deleted Sep 10, 2012

Hi Joakim,

I just noticed that linefeeds are being removed from the text messages ...

Cheerio,

Hans


 
István Hirsch
István Hirsch  Identity Verified
Local time: 02:55
English to Hungarian
Xbench Sep 10, 2012

Load the file in a suitable form (for example, as a tab-delimited file) into Xbench for analysis. It will display the problematic places as Tag mismatch or Numeric mismatch.
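
If the table needs converting first, a tab-delimited version could be produced along these lines. This is a rough sketch only: it assumes the rows are available as (source, target) pairs, and the function name to_tab_delimited is made up for illustration. Note that it flattens line breaks inside cells, which changes the text, so use it on a copy made for checking.

```python
def to_tab_delimited(rows):
    """Return rows as tab-delimited text, one source/target pair per line."""
    lines = []
    for source, target in rows:
        # Tabs or line breaks inside a cell would break the layout, so every
        # run of whitespace is collapsed to a single space.
        cells = [" ".join(cell.split()) for cell in (source, target)]
        lines.append("\t".join(cells))
    return "\n".join(lines) + "\n"


# Hypothetical sample row: (EN, NO)
print(to_tab_delimited([("<h1>Automatic renewal</h1>", "<h1>Automatisk fornyelse</h1>")]))
```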

 

