GPT-3 combines text processing with image recognition/GPT-3 combina el procesamiento de texto con el reconocimiento de imágenes

A few days ago we briefly mentioned in a post the strengths of GPT-3, a natural language processing system capable of fooling the people it interacts with, since it increasingly resembles a human, judging by the coherence of the answers it gives to our questions.

Hace unos días mencionábamos someramente en un post las excelencias de GPT-3, un sistema de procesamiento del lenguaje natural capaz de confundir a las personas con las que interactúa pues cada vez se parece más a un humano, basándonos en la coherencia de las respuestas que nos da a nuestras preguntas.

But as much as it impresses and confuses us, GPT-3 does not know what it is saying; it simply answers our questions by comparing them with huge amounts of text collected from social networks, Wikipedia, Reddit, etc., and looking for the answer with the highest likelihood of being right.

Pero por mucho que nos impresione y nos confunda, GPT-3 no sabe lo que está diciendo; simplemente responde a nuestras preguntas cotejándolas con ingentes cantidades de texto recolectado en redes sociales, Wikipedia, Reddit, etc., y buscando la respuesta que mejor porcentaje de acierto presenta.

Any human being is capable of picturing what is being discussed and of understanding the meaning and essence of what they see expressed in a set of symbols we call letters, subject to a defined set of grammar rules; that is, we understand what we say or read (almost all of us).

Cualquier ser humano es capaz de visualizar con imágenes aquello de lo que se está tratando y entender el sentido y la esencia de aquello que ve expresado con un conjunto de símbolos a los que llamamos letras, sujetos a un conjunto definido de reglas gramaticales; es decir, entendemos aquello que decimos o leemos (casi todos).

But this may change with two new models designed by OpenAI, called DALL-E and CLIP, which will be coupled to the GPT-3 model to give it the ability to understand and visualize the meaning of the semantic content it usually handles.

Pero esto puede cambiar con dos nuevos modelos diseñados por OpenAI, llamados DALL-E y CLIP, que se acoplarán al modelo GPT-3 para dotarle de la capacidad de entender y visualizar el significado de los contenidos semánticos que habitualmente maneja.

CLIP (Contrastive Language-Image Pre-training) is an image recognition system trained on images found on the internet. For a given image, it picks the caption that fits best from among more than 30,000 stored ones, which lets it link a wide range of objects with their names and the words that describe them.

CLIP (Contrastive Language-Image Pre-training) es un sistema de reconocimiento de imágenes entrenado para reconocer las imágenes que encuentra en internet y aplicarles un pie de foto de entre más de 30.000 almacenados, y puede enlazar montones de objetos con su nombre y las palabras que los describen.
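
To make the caption-picking idea more concrete, here is a minimal Python sketch. It assumes that the image and each candidate caption have already been turned into lists of numbers (the values below are made up for illustration and do not come from CLIP), and it simply picks the caption whose vector points in the direction closest to the image's. The function names `cosine_similarity` and `best_caption` are hypothetical; this is not OpenAI's code or API, only a toy version of the matching step.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def best_caption(image_vector, caption_vectors):
    """Pick the caption whose vector is most similar to the image vector."""
    return max(
        caption_vectors,
        key=lambda caption: cosine_similarity(image_vector, caption_vectors[caption]),
    )


# Made-up 3-number "embeddings", purely illustrative (an assumption, not real CLIP output).
CAPTION_VECTORS = {
    "a photo of a snail": [0.9, 0.1, 0.0],
    "a photo of a harp": [0.1, 0.9, 0.1],
    "a photo of an avocado": [0.0, 0.2, 0.9],
}

# Pretend this vector was computed from a picture of a snail.
image_vector = [0.8, 0.3, 0.1]

print(best_caption(image_vector, CAPTION_VECTORS))  # prints: a photo of a snail
```

In a real system the vectors would be produced by neural networks trained on image and caption pairs, and the list of candidate captions would be far longer; only the final "choose the closest caption" step is shown here.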

DALL-E, on the other hand, is trained to create images from a description. The header photo shows an example in which DALL-E was given the phrase "Snail made of harp" and, as you can see in the photo, it did quite well at drawing harp-shaped snails.

DALL-E, por el contrario, está entrenado para crear imágenes a partir de una descripción. En la foto de cabecera vemos un ejemplo en el que a DALL-E se le introdujo la frase "Caracol hecho de arpa" y, como habréis podido apreciar en la foto, ha estado bastante acertado dibujando caracoles con forma de arpa.

Combining these two modules with the intrinsic capabilities of the GPT-3 model, we will get a model capable of understanding and visualizing what it is talking about, which, I don't know about you, but to me seems like more than most of us humans manage.

Combinando estos dos módulos con las capacidades intrínsecas del modelo GPT-3 conseguiremos un modelo capaz de entender y visualizar aquello de lo que está hablando lo cual, no sé a vosotros pero a mí, me parece que es más de lo que conseguimos la mayoría de los humanos.

More information/Más información
https://www.technologyreview.com/2021/01/05/1015754/avocado-armchair-future-ai-openai-deep-learning-nlp-gpt3-computer-vision-common-sense/

hivewatcher (67) 3 years ago

Warning! This user is on our black list, likely as a known plagiarist, spammer or ID thief. Please be cautious with this post!
If you believe this is an error, please chat with us in the #appeals channel in our discord.

elmundodexao (67) 3 years ago

Hi @mauromar, it would be even better if they made something like this to make us smarter…

Have a nice day…

mauromar (73) 3 years ago

For many it is already too late. ;-D

leynedayana (52) 3 years ago

Hi @mauromar, happy start to the year… Interesting information, even though it shows that we are becoming obsolete… ;-)

Indeed, my friend, the truth is that things are moving too fast.

santoninoatocha (51) 3 years ago

Let's hope it brings us closer together, my friend @mauromar, but it seems that technology will end up distancing us from our humanity.

In the end the machines will do away with humanity, whether they exterminate us or we merge with them.
