Metadata-Version: 1.1
Name: goslate
Version: 1.5.0
Summary: Goslate: Free Google Translate API

Home-page: https://pypi.python.org/pypi/goslate
Author: ZHUO Qiang
Author-email: zhuo.qiang@gmail.com
License: MIT
Description: Goslate: Free Google Translate API
        ##################################################
        
        .. contents:: :local:
        
        ``goslate`` provides a *free* Python API to the Google translation service by querying the Google Translate website.
        
        It is:
        
        - **Free**: gets translations through the public Google website at no charge
        - **Fast**: batches, caches and fetches concurrently
        - **Simple**: a single-file module, just ``Goslate().translate('Hi!', 'zh')``
        
        
        Simple Usage
        ==============
        
        The basic usage is simple:
        
        .. sourcecode:: python
        
         >>> import goslate
         >>> gs = goslate.Goslate()
         >>> print(gs.translate('hello world', 'de'))
         hallo welt
        
         
        Installation
        ===============
        
        goslate supports both Python 2 and Python 3. You can install it via:
        
        
        .. sourcecode:: bash
          
          $ pip install goslate
        
         
        or just download the `latest goslate.py <https://bitbucket.org/zhuoqiang/goslate/raw/tip/goslate.py>`_ directly and use it.
        
        The ``futures`` `package <https://pypi.python.org/pypi/futures>`_ is optional, but installing it is recommended for the best performance on large translation tasks.
        
         
        Proxy Support
        ===============
        
        Proxy support can be added as follows:
        
        .. sourcecode:: python
        
         import urllib2
         import goslate
        
         proxy_handler = urllib2.ProxyHandler({"http" : "http://proxy-domain.name:8080"})
         # build_opener() installs the default HTTP and HTTPS handlers itself,
         # so only the proxy handler needs to be passed in
         proxy_opener = urllib2.build_opener(proxy_handler)
        
         gs_with_proxy = goslate.Goslate(opener=proxy_opener)
         translation = gs_with_proxy.translate("hello world", "de")
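        
        On Python 3, ``urllib2`` no longer exists; its functionality lives in ``urllib.request``. The following is a minimal sketch of building the same kind of opener there. It reuses the placeholder proxy address from above, adds an ``https`` entry purely for illustration, and assumes that ``Goslate`` accepts the resulting opener in the same way:
        
        .. sourcecode:: python
        
         import urllib.request
        
         # Placeholder proxy address; the "https" entry is an illustrative addition
         proxy_handler = urllib.request.ProxyHandler({
             "http": "http://proxy-domain.name:8080",
             "https": "http://proxy-domain.name:8080",
         })
         # build_opener() adds the default HTTP and HTTPS handlers automatically
         proxy_opener = urllib.request.build_opener(proxy_handler)
        
         # The opener would then be handed to goslate as in the example above:
         # gs_with_proxy = goslate.Goslate(opener=proxy_opener)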
         
         
        Romanization
        ====================
        
        Romanization or latinization (or romanisation, latinisation), in linguistics, is the conversion of writing from a different writing system to the Roman (Latin) script, or a system for doing so.
        
        For example, pinyin is the default romanization method for the Chinese language.
        
        You can get a translation in romanized writing as follows:
        
        .. sourcecode:: python
        
         >>> import goslate
         >>> roman_gs = goslate.Goslate(writing=goslate.WRITING_ROMAN)
         >>> print(roman_gs.translate('China', 'zh'))
         Zhōngguó
          
        
        You can also get a translation in both the native writing system and the roman writing system:
        
        .. sourcecode:: python
        
         >>> import goslate
         >>> gs = goslate.Goslate(writing=goslate.WRITING_NATIVE_AND_ROMAN)
         >>> print(gs.translate('China', 'zh'))
         ('中国', 'Zhōngguó')
        
         
        As you can see, the result is a tuple in this case: ``(Translation-in-Native-Writing, Translation-in-Roman-Writing)``
        
        Language Detection
        ====================
        
        Sometimes all you need is to find out which language a text is written in:
        
        .. sourcecode:: python
        
         >>> import goslate
         >>> gs = goslate.Goslate()
         >>> language_id = gs.detect('hallo welt')
         >>> print(language_id)
         de
         >>> print(gs.get_languages()[language_id])
         German
        
        
        Concurrent Querying
        ====================
        
        You do not need to roll your own multi-threaded solution to speed up bulk translation; goslate already does it for you. It uses ``concurrent.futures`` for concurrent querying. The maximum number of workers is 120 by default.
        
        The number of workers can be changed as follows:
        
        .. sourcecode:: python
        
         >>> import goslate
         >>> import concurrent.futures
         >>> executor = concurrent.futures.ThreadPoolExecutor(max_workers=200)
         >>> gs = goslate.Goslate(executor=executor)
         >>> it = gs.translate(['text1', 'text2', 'text3'], 'de')  # 'de': example target language
         >>> list(it)
         ['translation1', 'translation2', 'translation3']
        
         
        Installing the ``concurrent.futures`` backport library is advised on Python 2.7 (Python 3 includes it by default) to enable concurrent querying.
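        
        As a rough illustration of the underlying idea (not goslate's actual implementation), a thread pool can run several queries at once while still returning the results in input order. ``fake_translate`` below is a hypothetical stand-in for a single network query:
        
        .. sourcecode:: python
        
         import concurrent.futures
        
         def fake_translate(text):
             # Hypothetical stand-in for one network query; not part of goslate
             return text.upper()
        
         texts = ['text1', 'text2', 'text3']
         with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
             # map() runs the calls in worker threads but yields results in input order
             results = list(executor.map(fake_translate, texts))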
        
        The input can be a list, a tuple or any iterator, even a file object, which iterates line by line:
        
        .. sourcecode:: python
        
         >>> translated_lines = gs.translate(open('readme.txt'), 'de')  # 'de': example target language
         >>> translation = '\n'.join(translated_lines)
        
         
        Do not worry that short texts will increase the query time. Internally, goslate joins small texts into one big text to avoid unnecessary query round trips.
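        
        The joining step can be pictured with the following minimal sketch. It is not goslate's internal code: the function name, the ``max_length`` default and the newline separator are all assumptions chosen for illustration.
        
        .. sourcecode:: python
        
         def join_small_texts(texts, max_length=2000, separator='\n'):
             """Group short texts into batches of at most max_length characters."""
             batches, current, current_length = [], [], 0
             for text in texts:
                 # Characters this text would add, including a separator if needed
                 extra = len(text) + (len(separator) if current else 0)
                 if current and current_length + extra > max_length:
                     # The batch is full: emit it and start a new one
                     batches.append(separator.join(current))
                     current, current_length = [], 0
                     extra = len(text)
                 current.append(text)
                 current_length += extra
             if current:
                 batches.append(separator.join(current))
             return batches
        
         batches = join_small_texts(['hello', 'world', 'good morning'], max_length=12)
        
        Each batch could then be sent as a single query and the response split on the separator again, which only works if the individual texts do not contain the separator themselves.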
         
         
        Batch Translation
        ====================
        
        Google Translate does not support very long text. goslate bypasses this limitation by splitting the long text internally before sending it to Google, then joining the multiple results into one translation for the end user.
        
        .. sourcecode:: python
        
         >>> import goslate
         >>> with open('the game of thrones.txt', 'r') as f:
         ...     novel_text = f.read()
         ...
         >>> gs = goslate.Goslate()
         >>> gs.translate(novel_text, 'de')  # 'de': example target language
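        
        The splitting idea can be sketched as follows. This is an illustration only, not goslate's own algorithm: the function name and the ``max_length`` default are assumptions, and the sketch simply cuts after the last newline or space that fits within the limit.
        
        .. sourcecode:: python
        
         def split_long_text(text, max_length=2000):
             """Split text into chunks of at most max_length characters."""
             chunks = []
             while len(text) > max_length:
                 # Cut after the last newline or space found within the limit
                 cut = max(text.rfind('\n', 0, max_length),
                           text.rfind(' ', 0, max_length))
                 # Fall back to a hard cut when no usable break point exists
                 cut = cut + 1 if cut > 0 else max_length
                 chunks.append(text[:cut])
                 text = text[cut:]
             if text:
                 chunks.append(text)
             return chunks
        
         chunks = split_long_text('aaa bbb ccc', max_length=5)
        
        Because the break characters stay attached to the chunks, ``''.join(chunks)`` reproduces the original text.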
        
        
        Performance Consideration
        ================================
        
        Internally, goslate uses batching and concurrent fetching aggressively to maximize translation speed.
        
        All you need to do is reduce the number of API calls by making use of batch translation and concurrent querying.
        
        For example, say you want to translate 3 big text files. Instead of manually translating them one by one, line by line:
        
        .. sourcecode:: python
        
         import goslate
         
         big_files = ['a.txt', 'b.txt', 'c.txt']
         gs = goslate.Goslate()
         
         translation = []
         for big_file in big_files:
             with open(big_file, 'r') as f:
                 translated_lines = []
                 for line in f:
                     translated_line = gs.translate(line, 'de')  # 'de': example target language
                     translated_lines.append(translated_line)
             
                 translation.append('\n'.join(translated_lines))
         
                 
        It is better to leave the work entirely to goslate. The following code is not only simpler but also much faster (+100x):
        
        .. sourcecode:: python
        
         import goslate
         
         big_files = ['a.txt', 'b.txt', 'c.txt']
         gs = goslate.Goslate()
         
         # 'de' is an example target language
         translation_iter = gs.translate((open(big_file, 'r').read() for big_file in big_files), 'de')
         translation = list(translation_iter)
         
         
        Internally, goslate first adjusts the texts so that they are neither too big to fit the Google query API nor so small that they increase the total number of HTTP queries. It then uses concurrent querying to speed things up even further.
         
        
        Lookup Details in Dictionary
        ================================
        
        If you want a detailed dictionary explanation for a single word or phrase, you can use ``lookup_dictionary()``:
        
        .. sourcecode:: python
        
         >>> import goslate
         >>> gs = goslate.Goslate()
         >>> print(gs.lookup_dictionary('sun', 'de'))
         [[['Sonne', 'sun', 0]],
          [['noun',
            ['Sonne'],
            [['Sonne', ['sun', 'Sun', 'Sol'], 0.44374731, 'die']],
            'sun',
            1],
           ['verb',
            ['der Sonne aussetzen'],
            [['der Sonne aussetzen', ['sun'], 1.1544633e-06]],
            'sun',
            2]],
          'en',
          0.9447732,
          [['en'], [0.9447732]]]
        
        
        There are 2 limitations for this API:
        
        * The result is a complex list structure which you have to parse for your own use
        
        * The input must be a single word/phrase; batch translation and concurrent querying are not supported
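        
        As a starting point for such parsing, the sketch below pulls the translations per part of speech out of the sample result shown above. The meaning assigned to each position is inferred from that single sample and is an assumption; the real structure may vary between words and service versions. ``summarize`` is a hypothetical helper, not part of goslate.
        
        .. sourcecode:: python
        
         # Sample result copied from the example above
         result = [[['Sonne', 'sun', 0]],
                   [['noun',
                     ['Sonne'],
                     [['Sonne', ['sun', 'Sun', 'Sol'], 0.44374731, 'die']],
                     'sun',
                     1],
                    ['verb',
                     ['der Sonne aussetzen'],
                     [['der Sonne aussetzen', ['sun'], 1.1544633e-06]],
                     'sun',
                     2]],
                   'en',
                   0.9447732,
                   [['en'], [0.9447732]]]
        
         def summarize(result):
             """Map each part of speech to its list of translations."""
             # Assumption: result[1] holds one entry per part of speech, with the
             # part-of-speech name at index 0 and the translations at index 1
             return dict((entry[0], entry[1]) for entry in result[1])
        
         summary = summarize(result)
         # Assumption: the first item of result[0] starts with the main translation
         main_translation = result[0][0][0]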
        
        
        Query Error
        ==================
        
        If you get an HTTP 5xx error, it is probably because Google has banned your client IP address from translation querying.
        
        You can verify this by accessing the Google translation service manually in a browser.
        
        You can try the following to overcome this issue:
        
        * query through an HTTP/SOCKS5 proxy, see `Proxy Support`_
        
        * use another Google domain for translation: ``gs = Goslate(service_urls=['http://translate.google.de'])``
        
        * wait for 3 seconds before issuing another query
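        
        The waiting strategy can be wrapped in a small helper. The sketch below is a generic illustration; ``retry_with_delay`` is a hypothetical function, not part of goslate, and it catches every exception for brevity where a real program would probably catch only the HTTP error type it expects.
        
        .. sourcecode:: python
        
         import time
        
         def retry_with_delay(func, retries=3, delay=3.0):
             """Call func(), sleeping `delay` seconds between failed attempts."""
             for attempt in range(retries):
                 try:
                     return func()
                 except Exception:
                     # Re-raise once the last attempt has failed
                     if attempt == retries - 1:
                         raise
                     time.sleep(delay)
        
         # Possible usage with an existing Goslate instance `gs`:
         # translation = retry_with_delay(lambda: gs.translate('hello world', 'de'))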
          
          
        API References
        ================================
        
        Please check the `API reference <http://pythonhosted.org/goslate/#module-goslate>`_.
         
        
        Command Line Interface
        ==============================
        
        ``goslate.py`` is also a command line tool which you can use directly:
            
        - Translate ``stdin`` input into Chinese in GBK encoding
        
          .. sourcecode:: bash
          
             $ echo "hello world" | goslate.py -t zh-CN -o gbk
        
        - Translate 2 text files into Chinese, output to UTF-8 file
        
          .. sourcecode:: bash
          
             $ goslate.py -t zh-CN -o utf-8 source/1.txt "source 2.txt" > output.txt
        
             
        Use ``--help`` for detailed usage:
             
        .. sourcecode:: bash
          
           $ goslate.py -h
             
             
        How to Contribute
        ==================
        
        - Report `issues & suggestions <https://bitbucket.org/zhuoqiang/goslate/issues>`_
        - Fork `repository <https://bitbucket.org/zhuoqiang/goslate>`_
        - `Donation <http://pythonhosted.org/goslate/#donate>`_
        
        What's New
        ============
        
        1.5.0
        ----------
        
        * Add new API ``Goslate.lookup_dictionary()`` to get detailed information for a single word/phrase, thanks to Adam for the suggestion
          
        * Improve documentation with more user scenarios and performance considerations
        
        
        1.4.0
        ----------
        
        * [fix bug] update to adapt to the latest Google translation service changes
        
        
        1.3.2
        ----------
        
        * [fix bug] fix compatibility issue with the latest Google translation service JSON format changes
        
        * [fix bug] unit test failure
        
        
        
        1.3.0
        ---------
        
        * [new feature] Translation in roman writing system (romanization), thanks to Javier del Alamo for the contribution.
          
        * [new feature] Customizable service URL. You can provide multiple Google translation service URLs for better concurrency performance
        
        * [new option] roman writing translation option for CLI
          
        * [fix bug] Google translation may change normal space to no-break space
        
        * [fix bug] Google web API changed for getting supported language list
        
Keywords: google translation i18n l10n
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Environment :: Web Environment
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: MIT License
Classifier: Topic :: Software Development :: Internationalization
Classifier: Topic :: Software Development :: Localization
Classifier: Topic :: Text Processing :: Linguistic
