The problem with mathematics is simple: notation. It makes everything hard, especially for students who have just entered university and have to deal with many similar yet incompatible notations and the ways of thinking that go with them. There are a lot of loosely defined conventions (when they are not simply an individual author's own) that make mathematical text difficult to understand.
For instance:
f: is it a function, is it a scalar?
a: is it a function, is it a scalar?
What a strange notation system, where letters from the same alphabet implicitly have different types unless explicitly stated otherwise.
For instance:
f: should be a function
a: should be a scalar, or a vector
m: should be an integer index
If the letter itself matters so much to its type, then why does changing its case change the type, and in unrelated and incompatible ways?
For instance:
F: integral of f
F: set named F
F: function named F
A : matrix
A : vector
A : set
M : matrix (and only a matrix)
What a strange notation system, where a space, or the lack of one, carries meaning.
For instance:
aa = a*a
a a = a*a
What a strange notation system, where the same symbols have very different meanings.
For instance, what do you read here?
f(a) : apply the function f with the parameter a
f(a) : multiply the scalar f with scalar a
f(a) : multiply scalar f with vector a
f^a : scalar f to the power a
f^a : function f in a family of functions indexed by a
f^a : a scalar named f^a
What a strange language, where the semantics of an operation can change.
For instance:
f(a) : apply the function f with argument a
f(a) : declare a predicate f for argument a
f(a) = : define the value of f for a variable called a (unbound variable, i.e. generic)
f(a) = : define the value of f for a value called a (bound variable, i.e. given)
Maths is hard to understand because mathematical notations are just notations; they are not a language. This means that with every field, subfield and individual research paper you have to learn from scratch what the scribbles you are reading mean.
In a sense they are as precise as writing "todo: buy stuff" on a piece of paper: when you read it later, you will struggle to understand what the author meant.
It is too bad to crush the opportunity for the masses to understand and use mathematics on the whim of a notation. It is like the dreadful conventional direction of current in electrical engineering... which is like calling a fish a pig and then trying to understand how pigs swim.
A great source of confusion comes from the fact that letters sometimes denote vectors rather than scalars, and operations apply differently to them. Because summation and combination operations are implicit, things become even harder to follow. For instance, when considering matrices, it is important to know whether an ordinary product or a tensor product is being applied.
When multiplying two letters, possibly:
- we multiply a vector by a scalar, which yields a vector,
- we multiply a row vector by a column vector, which yields a scalar,
- we multiply a column vector by a row vector, which yields a matrix (the outer product), whereas multiplying two column vectors is an error,
- we multiply a matrix and a vector, or a matrix and a scalar, etc.
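The cases above become unambiguous once every letter carries an explicit shape. The following is only a minimal sketch, assuming the usual matrix-product convention; the helper `product_shape` and the representation of shapes as `(rows, cols)` tuples (with `()` for a scalar) are hypothetical choices made for illustration. Under that convention a column times a row is well defined and gives a matrix, while a column times a column is rejected.

```python
def product_shape(left, right):
    """Return the shape of the product left * right.

    Shapes are (rows, cols) tuples; a scalar is the empty tuple ().
    This is an illustrative helper, not part of any library.
    """
    if left == ():
        # A scalar times anything keeps the other operand's shape.
        return right
    if right == ():
        return left
    if left[1] != right[0]:
        # Inner dimensions must agree for a matrix product.
        raise ValueError(f"incompatible shapes {left} and {right}")
    return (left[0], right[1])


n = 3
scalar, row, column, matrix = (), (1, n), (n, 1), (n, n)

print(product_shape(scalar, column))  # (3, 1): still a column vector
print(product_shape(row, column))     # (1, 1): a single number
print(product_shape(column, row))     # (3, 3): the outer product, a matrix
print(product_shape(matrix, column))  # (3, 1): a column vector
```

The same two letters written side by side can therefore mean four different things; only the shapes, which the notation leaves implicit, tell them apart.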
06/02/2015
26/11/2014
Cunning and Crazy Energy Management through Simple Physics Tricks
I am amazed by this: half the world spends energy on cooling while, at the exact same time, the other half spends energy on heating.
There must be something to do about this!
Here are my not-serious-but-why-not proposals to deal with it, plus some additional ideas about energy efficiency.
They mainly revolve around four principles: energy storage, energy transfer, energy shielding and energy production.
* build North-South bidirectional heat pipelines
It is like building very long aqueducts, up to a quarter of the circumference of the Earth. The heat-transfer fluid warms up in the tropics, cooling them in the process, then flows towards the poles, where it cools down while warming those places.
But would pumping heat through the pipe consume less energy than pumping it in and out locally, as is currently done? If the same principle as aqueducts can be exploited, perhaps it would.
Would the transfer be so slow that the heat dissipates in the pipes before reaching its destination?
* build East-West bidirectional electricity pipelines
If not for heat, the same principle could be applied to solar electricity: it would be captured on the sun-facing side of the Earth and released on the other side. This way solar energy is also usable at night.
Two main issues: the energy losses along lines thousands of kilometers long, and the currently low efficiency of solar production.
* store water in an underground thermos
The circuit is the following: thermos > heat dissipator (inside the building) > heat collector (outside the building) > thermos.
During the day, cool water is pumped up to the dissipator, where it cools the building; it then goes to the heat collector, which is warm at that time, where it gets warmer still, and finally back underground to the thermos.
During the night, warm water is pumped to the dissipator, where it heats the building; it then goes to the heat collector, which is cold at that time, where it gets colder still, and finally back underground to the thermos.
The fluid could be any heat-transfer fluid,
the dissipator could be a standard radiator,
the collector could be solar panels.
Usable only when there is a large difference between day and night temperatures, i.e. in summer and in desert areas; unusable in winter.
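To get a feel for how much heat such a thermos could hold, here is a rough sketch based on the sensible-heat formula Q = m * c * ΔT. The tank size (10 m³, i.e. about 10,000 kg of water) and the 15 K day/night swing are assumptions picked purely for illustration, and the function name `stored_heat_kwh` is hypothetical; losses through the tank walls and pumping costs are ignored.

```python
# Approximate specific heat of liquid water, in joules per kilogram per kelvin.
WATER_SPECIFIC_HEAT = 4186.0


def stored_heat_kwh(mass_kg, delta_t_kelvin):
    """Heat stored in a mass of water warmed by delta_t_kelvin, in kWh."""
    joules = mass_kg * WATER_SPECIFIC_HEAT * delta_t_kelvin
    return joules / 3.6e6  # 1 kWh = 3.6 million joules


# Assumed example: a 10 m^3 tank and a 15 K temperature swing.
print(round(stored_heat_kwh(10_000, 15), 1))  # about 174.4 kWh
```

Whether that is enough depends entirely on the building, which is why the temperature difference between day and night matters so much for this idea.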
* collect cold in winter and release it in summer; collect heat in summer and release it in winter
Same idea as the previous one but on much larger scales, both in time and in size.
* build with smartly designed materials that dissipate heat quickly and absorb it with difficulty, or vice versa
e.g. metal sheets folded in a fractal pattern: lots of surface in a small space.
For insulation: the longer the path heat has to travel to reach the actual wall, the harder it is for it to warm the wall. A metal brick built this way would provide the longest possible paths for heat while still being sturdy.
For dissipation: the larger the surface of the dissipator, the more easily heat can be pumped out of the building.
* build thermos buildings
The building would be a large thermos bottle with windows in it. The layer of empty space between the two walls would make it hard for heat to enter the building in summer and to leave it in winter.
To limit the risk of large implosions, use small cells of vacuum.
* build underground buildings
No sun and no heat, no wind, no snow, no rain, no cold: a stable temperature throughout the year, but no natural light either. Energy-wise that is good. The catch is that there is little point in building "earth-scrapers", the opposite of skyscrapers, because the deeper one goes the hotter it gets, and AC would then be needed all year round. It might be too dangerous in highly seismic or sinkhole-prone areas. It would leave the surface for parks and nature.
* use IR-opaque glass for windows
Let only the visible light in, not the heat (infrared) light, so that it is reflected both from the outside, keeping the room cool in summer, and from the inside, keeping it warm in winter.
* use adjustable-opacity glass for windows
Because a large share of the sun's energy arrives precisely at visible wavelengths, being able to control how much gets in is important. This is already done with adjustable blinds or diaphragms in front of windows.
But it would be much better if it did not impair vision. What about micro-blinds too small to be seen? Or an LCD layer so that the amount of light is adjustable? An LCD requires power, but possibly less than AC. Or a stable LCD that maintains its state without electricity.
The poor could get a cheap one-piece LCD, while the rich could get an actual screen able to display shades and fancy patterns. There is definitely a business opportunity here.
* store surplus electricity and release it to match consumption peaks
The idea is to have a constant supply of energy able to handle both baseline and peak consumption. Energy can be produced outside peak hours and released later. Not having to switch facilities on and off is also a benefit.
For instance, flywheel energy storage could be used.
Nuclear energy comes to mind as a particularly suitable primary source, but so does solar energy, captured during the day and released at night.
* collect energy in the deserts
Not oil of course, but what deserts provide most: space, sun exposure and day-night temperature differences.
I have in mind specifically the sand deserts of Africa and Arabia. They are vast, very sparsely populated and intensely sunlit. Even large, ugly energy production facilities would not bother anyone.
The energy collected should be either transformed or sent through a grid directly to distant foreign countries.
As always the questions are: would this be efficient, and if not, when will it become so?
* capture the energy of heat with anti-refrigerators
Build heat pumps and heat concentrators. This way, heat collected below boiling temperature can be gathered and delivered above boiling temperature. The aim is to be able to produce heat or electricity (by adding turbines) in any place and under any conditions.
In a way we aim at building an anti-refrigerator: we capture heat outside a confined space, thereby cooling the exterior, in order to heat the interior of the box.
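How much work such an anti-refrigerator needs is bounded by thermodynamics. The sketch below computes the ideal (Carnot) coefficient of performance for heating, COP = T_hot / (T_hot - T_cold) with temperatures in kelvin; a real heat pump does noticeably worse. The function name `carnot_heating_cop` and the example temperatures (20 °C outside, 110 °C inside the box) are assumptions chosen only to illustrate the "above boiling point" case.

```python
def carnot_heating_cop(t_cold_celsius, t_hot_celsius):
    """Ideal upper bound on heat delivered per unit of work supplied."""
    t_cold = t_cold_celsius + 273.15  # convert to kelvin
    t_hot = t_hot_celsius + 273.15
    if t_hot <= t_cold:
        raise ValueError("the hot side must be warmer than the cold side")
    return t_hot / (t_hot - t_cold)


# Assumed example: pumping heat from 20 degrees C up to 110 degrees C.
print(round(carnot_heating_cop(20, 110), 2))  # about 4.26
```

The larger the temperature lift, the lower this bound, so concentrating low-grade heat above boiling point is exactly the expensive case.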
What are your crazy ideas for reducing energy consumption?
24/11/2014
Management, Manipulation, Lies and Bad Faith: how my former boss made my life miserable
My former boss does not know the first thing about science and has no morals, but that does not make him bad at everything, far from it. There are two areas in which he excels: manipulation and management. Let us look in more detail at how he went about it.
The first sign that he is very good at manipulation is that he always gets his way. Not once did a meeting end with a decision other than the one he had already made.
The second sign that he is very good at manipulation is that he never directly asked us to carry out these end-of-meeting decisions; we were the ones who committed to them.
The third sign that he is very good at manipulation is that when we questioned certain things, he did not hesitate to lie to cast them in a favorable light.
The last sign that he is good at manipulation is when you cry alone with rage on your bed, thinking back on what he put you through. That will be the subject of another post, on my experience with psychological harassment.
The real sign that he is good at manipulation is that I am still in such a bad way that, nearly a year after the events, I find myself writing a long and detailed post about it. This is far from my first attempt to understand what happened to me, but it is the first one that has worked.
I will come back to his manipulation techniques later. First I want to present his management techniques, which are less controversial, although we will see that it is in fact far from that simple.
* Management
His management techniques are largely good. I present them here, and they are undeniably positive if taken at face value. I myself felt they were skillful and sensible while I was living through them.
However, right afterwards we will see the downsides of these methods in practice.
The rules and their positive aspects
MG1) He is very smiley, very friendly, and seems to genuinely care about you. And I think that was largely the case, though only insofar as he needs you. He is almost affable, even, when he offers you a drink and biscuits almost every time and meetings start with small talk. You feel at ease, you feel he is special, you feel special yourself, and you do not want to disappoint him.
MG2) He makes sure every meeting ends with a clear decision and with what we (i.e. the others) must have done by next time. That is certainly good practice.
MG3) Any report written for him must follow a precise structure: after explaining all the details of what was done, you must end with three things: a very brief summary of the results, a list of suggested actions, and the one action you recommend. The idea is that he can read just the conclusion and the suggestions, without the rest, and still be able to decide.
MG4) When you have an idea of your own he says: "I am very happy that you thought of this, and I think it will be great for you to work on it, but later; I think the priority now is to work on this other thing", and that other thing is always what he wants us to work on.
MG5) He is very enthusiastic and ambitious about you, about the project and about his approach. He lays on the superlatives and positive adjectives. You feel very motivated, you are determined to succeed and you believe in it. His vision for us and our work, he says, is to publish in "Science" or "Nature", nothing less.
MG6) He prefers in-depth work to small incremental changes, and is therefore fine with publishing little, the idea being to publish very well. That sounds avant-garde.
MG7) He often tells people they must "push extra hard" on their work. He says it in exchange for personal requests, even minor ones, and he also says it when the work is not progressing because it is not giving the results he wants.
MG8) He supervises from afar, infrequently and quickly, with the aim of freeing up as much time as possible for himself and leaving the concrete work to others as much as possible.
MG9) He gives his subordinates autonomy.
Negative aspects of the rules
MG1: once you have seen through him and realized that, when he does not need you, he could just as well stab you in the back, you lose all trust in him and all his displays of friendliness become repulsive.
MG2: having a decision is good... when there are decisions to be made. Making decisions even when there are none to make is pointless and shows a lack of vision and confidence. Making decisions when there has been no time to go over the different aspects of the question is bad. Making decisions without taking into account the feedback from the latest results is very damaging. Yet that is what happened with him at almost every meeting.
Worse still is his consistency in inconsistency: he never remembers what was discussed or decided at the previous meeting, and in practice he almost never takes any interest in the decision he took such care to establish.
MG3: he does not read the reports, and most of the time not even the conclusions. He does not try to understand the details of why the results do not work. In that situation it is impossible to make informed choices. Most of the decisions we made (that he made) were random directions. And not only does he own up to it, he proclaims it.
MG4: in practice you never work on your own ideas and always on his. So he does not regard his postdocs as researchers but as "super lackeys", i.e. Master's students who work faster. If his ideas were good, or if he had expertise in the field, that might still be acceptable, but that is not the case, and depriving himself of his collaborators' opinions, talent, experience and imagination is a serious mistake.
MG5: the sole purpose of this enthusiasm was to motivate us to work hard. He dangled a gigantic carrot at the end of an enormous amount of work. In truth it was illusory. He himself has published only once in the journal Science, and as the least significant author of the article. What is more, the word he uses most is "cool", and it is by far his most important criterion for choosing a research direction and for judging the quality of a paper or a piece of work. Which, needless to say but better said, is extremely reductive and hardly appropriate.
MG6: such an approach to science is entirely commendable. However, before allowing himself this, he himself produced a whole pile of trivial little publications. Not publishing is more serious for us, postdocs and students, than for him.
MG7: working too much is bad. It pushes people to exhaustion, even burnout. To my knowledge at least two of us burned out during the year I spent under his direction. And that is very serious, because it is a long-term loss of productivity and motivation, most often for short-term gains. Trading requests for extra work makes perfect sense logically, but in practice letting people live their lives and accepting small accommodations and losses pays off more than trying to manage them like machines. Managing people like machines works so well that they break.
MG8: he behaves as if he were a king above the work and its details, whose function is to reign over a court of students. He has the detachment and arrogance of one, as if everything he did for us were a benevolent concession. His responsibility stops very quickly, however: he says that ideally his role is to come up with a paper title and an idea for the main figure, and leave the rest to others. His work then consists of going to meetings and saying and writing that he is very busy, while rarely being present in any case. By being so far from the realities of the work he becomes a bad leader; by putting on airs more than doing, he loses the trust of others.
MG9: autonomous people are more motivated and know their subject inside out. However, he listens to no feedback from those who have their hands in the grease, and he is the one who makes all the decisions. By being so far from the work he makes bad decisions; by not trusting his team to be autonomous in how it approaches the problem, he cuts himself off from many possibilities and makes his team less effective.
In the end I suspect the main factor is that this was not his specialty: he did not know what to publish or in which direction to search. His management skills are not bad. They are, however, applied with cynicism and hypocrisy by someone who in reality does not know which direction to take his research.
* Manipulation
On closer inspection, most of his management techniques are manipulation techniques. In retrospect, after writing this post, I realized that most of them come from "How to Win Friends and Influence People" by Dale Carnegie, and that apart from calling himself into question he applies almost all of them.
The manipulative side of the management rules
MG1: being likeable and making yourself likeable is fine, except when there are ulterior motives. Playing the nice guy is probably the simplest manipulation technique: it is harder to say no to, or to disappoint, someone who seems to value you. The fact that work only starts once we are at ease with his biscuits and conversation is a way of manipulating our state so that we are more agreeable and receptive. We feel both attentive and indebted for his generosity.
MG2: making a decision is not manipulation, except that he makes sure it is always you who states the decision, "I must do X". He never says "you must do X" himself, but always "it would be good to study X, what are you going to do?". This is a manipulation technique that exploits the fact that when you commit verbally you dedicate yourself much more to the task.
MG4: he does not say a flat no to our own idea; on the contrary, he speaks very highly of it! He simply says we can do it after we have done something else, and that moment never comes. And even that he says as indirectly as possible: "I think it is better to start with (my) X and then do (your) Y, do you agree?". With such a compromise it is hard to protest and you go along with it; it is a reciprocity strategy accompanied by soft, positive words. What is more, as in MG2, it gets us to commit verbally to doing what he said, without his ever having given the order!
MG5: the enthusiasm was produced by a very big and very vague promise, but one that was not logically untenable. Moreover it seemed to depend on how hard we worked, whereas that is not the only factor. Yet neither the claim that the goal was hypothetically reachable nor the claim that it depended on our hard work could be disputed. So you tend to pour all your effort into it unreasonably.
MG6: not pushing people to publish in a publish-or-perish environment is not reasonable in practice. It is either avant-garde or suicidal. And once you understand that to him his staff are replaceable at will, it is murderous from an academic point of view. The manipulation consists in making us believe it is good for us, when in reality it is mostly good for him: so that he can publish an in-depth piece of work that earns him the great recognition he so desires.
MG7: encouraging people is one thing, deceiving them is another. Encouraging them to make an effort when you do nothing yourself is low, and when on top of that you have something to gain from it, it is lower still. In this he uses authority rather than seduction to push people to act.
MG8: the manipulation consists in giving himself the image of a genius and a very busy person, in order to push us to be as brief as possible in our dealings with him and as submissive as possible to his quasi-divine decisions.
MG9: people who feel responsible and autonomous are more motivated. But the trap is that they are autonomous only in carrying out what he has decided: any independent reasoning or approach is, on the contrary, systematically discouraged (which in research is the height of absurdity, doubly so when it is not his area of expertise). In the end you become a machine for executing his wishes, which is dehumanizing, and you are deceived about your situation.
* Lies and Bad Faith
His most frequent lies consisted in making us believe in his competence, and that by working extremely hard according to his directions we would get incredibly good results that would be published in the best journals in the world. That was his mantra for all of us. He would look at us, smiling and seeming inspired, and repeat it endlessly, several times each time we saw him. We all believed it; most probably still do.
A few examples:
- the carpool of misery
In the job interview, conducted remotely, he had said I would have to live in city B, where he and the two other team members live, even though the university and the students are in city A. The reason: he himself does not go to A often and it would be easier for holding meetings; what is more, people carpool to get to A. In practice it was quite different: there were few meetings (but I was immediately available to him); carpooling was the exception, for when he needed to finish a meeting with us earlier, not the rule, and not in the spirit of sharing the ride; and above all, the following vexing episode took place.
After he had driven me three times, apparently delighted (he was) with his new employee's conversation, the fourth time, when the novelty had dried up, he said: "you know, I could not take you at all, and if I do pick you up I could also leave you by the side of the road...". I was stunned; I did not understand what he was talking about. He said that "others might also no longer want to take you if you do nothing". In the end what he wanted (and said indirectly, letting me be the one to utter the word) was money. I asked him what arrangement he had with his other postdoc (with whom he carpooled without any problem): the postdoc paid for the petrol, and that day he had just paid for it. So he said "it's fine for this time, I'll do you a favor and take you".
[ for context, and to understand how petty this is of him: he drives an extremely luxurious Mercedes, he earns 12,000 euros a month, and in that country a full tank costs less than 15 euros ]
- phantom trip and real hypocrisy
The first paper of ours that was accepted was at a conference in the USA. The conference was also six months after the end of my contract. On my last day I asked him whether he would pay for me to attend. He said no, and that if I wanted to go, maybe my next lab would pay for the trip. What bad faith! If he himself was not willing to pay for a trip for work that directly concerned him, how could he expect me to swallow that some hypothetical institute with no stake in it would pay? To which he added: "but maybe a student or I will be able to go, and that will be good, it will make the work known".
For once I did not let myself be thrown by a bogus argument; I replied calmly: "the point of going to a conference is not to present work but to network, to meet people". To which he replied, looking frightened, "that's true". And I even added that I particularly needed it, which of course changed nothing.
First lie: making me believe another institute would pay for me, so that I would not ask him.
Second lie: presenting a conference in a misleading light to make me believe I had just as much to gain by not going, and so would not ask him.
The end of the story: several months later, he was the one who went, and he did a lot of networking there. And for him to go, I had to send him a document so that he could get his expenses reimbursed. He really had his cake and ate it too in this story (and then some, if I told you the whole thing).
- one, two, three lies
When the main project I worked on under his rule was still not producing results, I decided on my own initiative to work on a side project I had set aside. I had informed him, and after a week, when results were already starting to come in, he told me "I have decided it is better to stop the old project and concentrate on this one". No kidding... He must not lose face, I understand. But it clearly shows that he does not accept his responsibilities, namely the failure of the old project, for which he blames me, and that he does not accept that people other than him can take initiatives, or worse, good initiatives. So to cover this up he makes it look, in retrospect, as if he is the one making the good decisions. I did not say a word, so as not to add fuel to the fire, but others will certainly have no choice but to believe him.
* Personal attitude
A few other traits of his character:
- disdainful
He talks behind people's backs and speaks ill of them in extremely crude terms. For example, a very nice student who was struggling with his studies: far from looking into his case to help him (which I would have done, and which in my view is a teacher's role), he called him "the loser" and tried to laugh about it with the other professors.
About a particular lab we both know, he says what they do is fine, but rather "meh". He says the same of the rest of computer science research. He particularly likes saying that what others do is rubbish and that they are bad; his face really lights up at those moments.
- an opportunist
His original specialty was argumentation, a sub-field of artificial intelligence and computer science. The specialty of the team he leads is "computer science applied to the social sciences", with big data if possible.
He proclaims loud and clear that he is not interested in computer science and wants to work in "computational social science" because it is easier to publish in prestigious journals. And indeed this is a recognized failing of that branch: there are many rather hollow publications that do end up placed very high.
He worked on crowdsourcing while it was fashionable, but not too fashionable. Indeed, once all the easiest results had been obtained and doing quality work, even of average quality, required too much knowledge of the literature and too much specific expertise, he decided to stop and move on to something else.
He chooses a research field not out of scientific interest but for ease of publication, in order to get what he wants: fame. He did so on the advice of a friend (Manuel Cebrian), who is very well connected in precisely that community and to whom he systematically turns for any important scientific decision.
- all for him and him for himself
He is someone who is very fond of himself and denies himself nothing: splendid apartment, splendid car, splendid job. His personal ambition is money in the short term, fame in the long term, and power in all cases.
He does not much like working, however, nor does he like his research field: computer science in general and anything else in particular. To succeed he therefore has to rely on the work of others. And that is where he is strongest. He knows how to do two things extremely well: come out ahead and get others to work for him.
He has an ethic: hard-line individualism and respect for contracts; his intellectual mentors are found on the American far right. He has an ethic but no morals: he twists the terms of an agreement to his own maximum profit.
That is why he is charming with you as long as he needs you and does not hesitate to be odious otherwise.
He has no loyalty to his research institute and says he will stay only as long as he is OK with the conditions. He has no loyalty to his research field and says he will stay only as long as it is easy to get results quickly. He has no loyalty to people; he uses them. That he does not say, but I experienced it directly.
- a taste for money
I remember he had to negotiate with other professors a flat-share contract under which he would never actually be in the apartment (for administrative reasons). He bargained hard to pay as little as possible, but in the end the others made him pay an almost normal rate. Those two people had told me he is very stingy.
Which I had already seen for myself. When I had just moved into my apartment he offered to sell me a desk chair. His old desk chair. It was clearly worn but still fairly functional. He sold it to me for a little more than half the price of a new one. Which I did not know, as I did not know the price of a chair. I learned the hard way, when I later tried to resell it, that this kind of item is almost unsellable, especially in the condition it was in. I would never have imagined anyone could be so petty, but he was, instead of magnanimously giving me a chair that is not worth much.
For him there are no small savings: as soon as money is involved he becomes unyielding, even if the sum is trivial. I have already mentioned that although he earns 12,000 euros a month he had another postdoc pay for his petrol (15 euros a tank). Petrol which of course covers more than just the carpool trips. Every service calls for something in return, but he has no qualms about helping himself to more if he can.
But where he is most unyielding is over his salary and his major expenses, all of which he negotiates extremely attentively, at length and in minute detail. I was not there, of course, but I often heard him talk about it.
- his own weakness
Iyad Rahwan is scientifically weak (only 3 years in a discipline with no relation to his previous work) and he knows it. He owns up, a little too casually, to not knowing quite a number of things he ought to know, and denies knowing certain others with a little too much pride.
More than any challenge to his competence, which he knows is weak and which is easy to brush aside because nobody in the lab really knows the subject, what he cannot stand is a challenge to his power: that someone is not in admiration of him, or at least submissive, is the real problem. It brings down the foundations of his mask as scientific lead: without it he is naked, and without the blind, unquestioned work of his subordinates he is nothing.
When put in difficulty, when someone insists on negotiating or asserting their own interest, he reacts with violence and threats and, since he has power, with revenge as well. All his apparent gentleness disappears to reveal him as he really is: a monster of selfishness and domination.
* Conclusion
He had clearly read a lot about management and manipulation. That is not very surprising, given that his previous research field was negotiation and argumentation.
So he applied techniques that were good, but 1) without integrity, because of his amorality, 2) without ecology, because of his acute and openly claimed selfishness, and 3) without direction, because of his lack of competence.
Most of these techniques rely on manipulation, which is bad in the long run. He also failed to apply other good management techniques because of his need for power and domination.
Leaving this boss was my best decision that year. I voted with my feet and left the company. He was really surprised when I told him, and said nothing at the time.
A few days later he ended up saying that the human resources department could not extend my contract: one last dodging of responsibility on his part.
Beware of him if you ever have to deal with him, and of sly manipulators in general.
And if you are an innocent, ingenuous soul, read a book on manipulation right away so that you never get taken in; prevention really is better than cure in these cases.
15/11/2014
Rosetta Mission
I can judge neither the scientific nor the engineering achievement of the Rosetta/Philae mission. However, I can confirm that the mission's biggest success so far was a marketing achievement.
I have been hoping for something like this to happen for years, and I am very glad that it happened at last. As a European I have always been surprised that in every outlet we read "NASA did this", "NASA plans to do that" almost all the time, while reading close to nothing about European achievements and plans. Because I knew Europeans had already achieved, were achieving and planned to achieve some impressive shit too.
It seems that the mindset in the USA is to be incredibly proud of oneself, to brag about anything you achieve and to share it live and passionately. In Europe the mindset seems to be quite different: be insecure about one's achievements, never brag about them, wait for your projects to finish before even talking about them, and when talking about them do it in the most formal, neutral and concise way.
If these cultural prejudices bear some truth, and I believe they do, it is no wonder that NASA is so popular and ESA so unknown. And it is not just about ESA: the same goes for other European scientific efforts. For instance, ground-based observatories (VLT and ALMA, anyone?), space-based observatories (Gaia and Herschel, anybody?) and particle physics, with the exception of CERN, which is too big to ignore.
Granted, NASA did achieve much more than ESA: landing men on the Moon, landing (and crashing) robots on Mars, the Space Shuttle. But ESA crashes robots on Mars too! And it would have done so as early as 1996 had the Russian rocket carrying it not exploded. But these are mostly part of history.
The rest is less impressive and on par with ESA. The ISS is an international effort, the Hubble space observatory has ESA instruments on board, and the Cassini and Ulysses missions were joint NASA/ESA work. However, NASA floods the public with images, while ESA barely does any outreach, not that it has nothing to show!
Europeans, including scientists, don't like the spotlight or mixing PR with work. Fortunately, with the Rosetta mission ESA came out of its torpor to claim its fair share of fame in the world's imagination.
The PR was brilliant:
- a very impressive (and costly?) short feature starring a recognizable major actor from one of the most followed series
- public engagement over Twitter, with live updates as if Philae were a human. The account reached 330k followers. Public engagement to choose a name, with the winner flown to Darmstadt.
- news outlets reported a lot about the mission from the moment it was reported that the comet smelled bad. Such trivia is people's crack, but it really got people interested.
- the focus on the uncertainty of each operation, as if everything could fail at any point yet succeeded, was catchy if not entirely honest
In a word, it was entertaining, gripping and somehow inspiring.
As a consequence, people from around the world felt emotionally connected with the mission, and especially with the lander Philae. The fact that its batteries were about to die quickly made it even more alive. Even if the money was not spent on science, the fame and connection it gained the agency were totally worth it, because they will help attract talented scientists to ESA and to Europe in general.
It is one of the cruellest lessons of life for shy task-oriented introverts and achievers like me, and one of the happiest facts for bold people-oriented extroverts like slackers and publicists: what you did is not half as important to others as how you present it, even if you did great.
Now on the engineering side. I wonder. I wonder how they could be smart enough to make Philae land with a precision of about a meter and yet not smart enough to keep it from bouncing rather than landing...
Coming in at the smallest speed possible seemed the best thing to do... with small downward thrusters to prevent it gaining too much momentum. Instead, Philae had a big upward thruster to push it back onto the comet once it had landed and started to bounce.
Also, having Philae land with just the ballistic impulse of Rosetta and no position correction is an astonishing feat. Yet it bounced, rather than landed. Strange that if they managed to get this right they didn't manage to get the simpler stuff right.
Even technically... a thruster not working, when this is the core of space tech, and harpoons not working, when this is super low tech: that is disappointing. But the PR was good enough to make us not see these blatant failures.
Even when it "landed", it was said that one of its feet didn't touch the ground. There are three stable situations that correspond to this: being on the edge of a cliff, being on uneven ground or, more prosaically, lying on its side. Good PR again ESA, keep it this way to make us dream even more!
14/11/2014
Are we not scientists?
One of the things that dumbfounded me the most during my last postdoc position was the open contempt expressed by my former supervisor towards science, and especially towards computer science.
I am fine with almost any opinion as long as it is consistent with the position of the person who voices it. In his case, he was head of a team in a computer science department in a private research institute that doubles as a private university.
And this was not just posing or something that came out once. It is something that he would cheerfully and repeatedly state in weekly and private meetings; he was both proud and serious about it.
He even bragged once that he didn't like algorithms and computer science because they were too hard, while he didn't want to go to industry because it posed concrete problems too difficult for him to deal with.
Academia was a perfect niche where he could get an insane salary for minimal work and responsibilities. He said it and meant it, in a weekly group meeting, in front of about 10 grad students and two postdocs.
Let us get a bit deeper in the behaviour of this academic hero:
Regarding teaching, he copy-pasted his course material from an open source course. I wrote and graded the exams.
Regarding research, students and postdocs did the work and he orientated them. I should rather write disorientated, because the bastard had not much more of an idea of how to do research.
By bastard I actually mean: a cocky, showy assistant professor, madly in love with himself, earning 15 000 USD monthly for little achievement and involvement, who started humiliating me when he couldn't manipulate me anymore.
Let me be a little more specific on his scientific incompetence:
- he despised and didn't understand computer science in general. Algorithms are for him complicated stuff that he is absolutely happy not knowing and not doing.
- regarding coding he loved one-liners, not the smart and tantalizing ones, but the ones that are called "docomplexstuff(data, params)" from "complexstuflib". Because he had no fucking idea how to code anything
- he has little more of a clue about the very research topic of the lab, his lab: computational social science, because he only started three years ago on the suggestion of his closest friend.
- he aims only for quick wins, which is the reason why he switched fields to work in computational social science: it is easy to get published in an important journal with little work to show for it
His approach to science is:
- choose a title and figure that would make it into "Science" or "Nature", i.e. a very general and inspiring result, yet totally unfounded at the moment. (He displays disdain for anything lower ranking, yet he never gets published there, and generally doesn't get published often)
- make subordinates work on it, and push them very hard, to the brink of burnout
- totally ignore the output of the work if it doesn't yield what he hoped for, and start anew with fresh title and figure ideas
- when really stuck, which happened on a regular basis, get back to his well connected and knowledgeable friend for advice.
- the main assets of his friend, which he was unsuccessfully trying to copy, are: being brazen, knowing a bazillion buzzwords and their references (not the content!) and a profound understanding of the politics of scientific publishing (which, for the latter, is: suggest your famous friends as your reviewers)
That's it: the guy didn't give a shit about doing science, i.e. pushing the boundaries of knowledge to do useful and beautiful things. All he is interested in is earning shitloads of money and being famous. Worse, he has 0 ethics and 0 morals: he would pat you on the back when he needs you and stab you when he does not.
I had to suffer this well groomed and well mannered, incompetent and manipulative prick for more than a year. Needless to say, he was shit-scared by my rejection of his work ethic, as well as of getting caught not knowing. Quite unfortunately this guy still holds his position and has an army of grad students who can't say anything because they risk being thrown out. I will expose his behaviour and attitude more precisely in another post.
Getting back to the main point of this post, this seems to happen often in science: politically slick and scientifically incompetent people getting to high positions. What is the point of doing science in such a case? For them, for their institution and, most importantly, for us?
02/11/2014
The need for a usable and correct information management tool
THE PROBLEM
Complex domains call for tools able to manage this complexity.
By complex I mean here rich in concepts and in relations between these concepts.
Obviously, when solving a problem within a domain, not understanding the domain is a recipe for failure: wrong decisions are taken because data is missing or misinterpreted. The finished project is not relevant.
Just as obviously, trying to understand every aspect of the domain is also a recipe for failure: brutal abuse of resources, serial murder of deadlines. The finished project is no longer relevant, because it comes too late or cannot even be completed.
Lack of data impairs the decision process.
Abundance of data impairs the decision process.
In order to manage complexity some tools have been designed:
- graphs (capturing every concept and relation)
- mindmaps (capturing every concept and hierarchical relation)
Graphs
Given a clear definition of the relation, graphs are complete: they can represent the exact information. Yet because they may not be easy to visualise, they are often of little help.
Why are graphs of little help?
Imagine 10 people, connected by 5 edges each (an edge meaning "acquainted"):
- it is hard to see who is related to whom (i.e. there is too much information for a human brain to process)
- because we don't know how and when the relations were created, it is hard to see what a relation actually means to the people involved (i.e. there is not enough information to answer any interesting question)
Mindmaps
Mindmaps are trees, i.e. they are graphs restricted to a hierarchical structure. They make information visualisation extremely easy, but because they are incomplete they are of little help.
Why are mindmaps of little help?
Imagine 10 topics, connected by 5 edges each (an edge meaning "relevant to"):
- the forced hierarchical decomposition makes some topics subtopics of others, which is not necessarily correct; the apparent information is therefore false.
- the forced hierarchical decomposition deletes relations between topics, which is wrong; the apparent information is therefore incomplete, hence false.
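To put a number on the second point, here is a small Java sketch. The class name MindmapLoss and the particular way the 10 topics are wired (each topic linked to its two nearest neighbours on either side of a ring, plus the opposite topic) are assumptions made only for illustration. It counts the relations in the graph, then counts how many survive when the graph is flattened into a tree by a breadth-first traversal.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical demo class; the ring-shaped wiring is an assumption for illustration.
class MindmapLoss {
    // 10 topics; topic i is linked to i+1, i-1, i+2, i-2 and i+5 (mod 10): 5 edges each.
    static List<List<Integer>> buildTopics() {
        int n = 10;
        int[] offsets = {1, 2, 5, 8, 9}; // 8 and 9 stand for -2 and -1 modulo 10
        List<List<Integer>> adj = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            List<Integer> neighbours = new ArrayList<>();
            for (int o : offsets) {
                neighbours.add((i + o) % n);
            }
            adj.add(neighbours);
        }
        return adj;
    }

    // Each undirected edge appears in two adjacency lists.
    static int countEdges(List<List<Integer>> adj) {
        int total = 0;
        for (List<Integer> neighbours : adj) {
            total += neighbours.size();
        }
        return total / 2;
    }

    // Breadth-first search from topic 0: every newly discovered topic
    // contributes exactly one edge to the resulting tree (the mindmap).
    static int countTreeEdges(List<List<Integer>> adj) {
        boolean[] seen = new boolean[adj.size()];
        Deque<Integer> queue = new ArrayDeque<>();
        seen[0] = true;
        queue.add(0);
        int treeEdges = 0;
        while (!queue.isEmpty()) {
            int u = queue.poll();
            for (int v : adj.get(u)) {
                if (!seen[v]) {
                    seen[v] = true;
                    treeEdges++;
                    queue.add(v);
                }
            }
        }
        return treeEdges;
    }

    public static void main(String[] args) {
        List<List<Integer>> adj = buildTopics();
        int all = countEdges(adj);
        int kept = countTreeEdges(adj);
        // prints "25 relations, 9 kept, 16 dropped"
        System.out.println(all + " relations, " + kept + " kept, " + (all - kept) + " dropped");
    }
}
```

Whatever the wiring, a tree over 10 topics can hold only 9 edges, so 16 of the 25 relations have to go.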
THE SOLUTION
When designing a tool to help grasp a complex topic, one has to keep two things in mind:
- the purpose for which the tool will be used, and whether it is critical: purposes can be very different, so the data must be queried in the appropriate way
- the physiological and psychological limitations of the human brain: its limits are very low, so only a limited amount of data should be presented
THE FUTURE
When reading an article about a topic, we may be interested in different aspects of it, and at different depths. Rather than reading the whole encyclopedia article, it would be more reasonable to display information depending on what the reader already knows and what they are looking for. At the level of a book, this dynamic display of information would be akin to rewriting the story depending on where the reader picks it up and what they already know. I believe that this is the future of information presentation.
21/10/2014
Graph library design choices
I am currently writing the documentation for my library. The effort of structuring and explaining the underlying logic of it is definitely a great help in making apparent what the library lacks and what could be done to make it better.
Unfortunately, time is limited, and I will never have the time to make the perfect Java graph library. Here are nevertheless some tips and facts to keep in mind in order to do so.
Graphs are about:
- relations between objects
And possibly:
- information about objects
- information about relations between objects
yes, but:
What?
i.e. which information?
* about the relations:
- directed or undirected
- normal graph or multigraph
- normal graph or hypergraph
- both directed and undirected edges?
* ad-hoc information:
- weighted? (e.g. for min paths)
- weighted with several weights? (e.g. for max flows)
- with spatial information?
- with any other information?
Now, where, when and how to store this information?
Where?
* within the objects: in an information field of the objects
n.name = "toto";
n.name;
+: does not use a hashmap
-: increases the size of a node
* outside of the graph: in a map object -> information that stores the information separately from both the objects and the graph
names.put(n,"toto");
names.get(n);
+: algorithms that modify the information leave the graph untouched (the graph does not contain state information)
-: uses hashtables (heavy, centralised)
* within the graph but outside the objects: the same map, but stored at the level of the graph
g.names.put(n,"toto");
g.names.get(n);
+: the info about the graph is stored within the graph (easy to access)
-: the graph contains state information
-: uses a hashtable
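The three locations can be put side by side in one small compilable sketch. The class names StorageDemo, Node and Graph and the method storeEverywhere are hypothetical and exist only for this illustration; they are not the API of my library.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class: stores the same name in the three places discussed above.
class StorageDemo {
    // Option 1: the information lives in a field of the object itself.
    static class Node {
        String name;
    }

    // Option 3: the graph owns the map, so the information travels with the graph.
    static class Graph {
        final Map<Node, String> names = new HashMap<>();
    }

    // Writes the value to each location and returns what is read back from each.
    static String[] storeEverywhere(String value) {
        Node n = new Node();
        Graph g = new Graph();
        // Option 2: a free-standing map, owned by neither the node nor the graph.
        Map<Node, String> names = new HashMap<>();

        n.name = value;        // within the object
        names.put(n, value);   // outside of the graph
        g.names.put(n, value); // within the graph, outside the object

        return new String[] { n.name, names.get(n), g.names.get(n) };
    }
}
```

Node does not override equals or hashCode here, so the maps key on object identity, which is usually what one wants for graph nodes.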
When?
* If the info associated with a node is frequently accessed, it is better to store it in a dedicated field.
+: fast
-: need to implement ad-hoc classes
* If several algorithms are going to be used on a given graph, it may not be possible or desirable to modify the graph, i.e. to change its state. In that case it is better to store the information relevant only to one algorithm outside of the graph.
+: respects functional decomposition
+: easy to implement
-: state info is sometimes interesting or necessary; solution: return it alongside the standard return value in a tuple
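As a sketch of that last solution: the algorithm below keeps its working state in its own map, leaves the graph untouched, and hands the state back together with the answer. Java has no built-in tuple, so a small result class plays that role. The names Reachability, Result and search are hypothetical, and the graph is assumed to be a plain map from node id to neighbour ids.

```java
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical example: breadth-first reachability that returns its state too.
class Reachability {
    // Stand-in for a tuple: the answer plus the state built while computing it.
    static class Result {
        final boolean reachable;
        final Map<Integer, Integer> depth; // node -> number of hops from the source

        Result(boolean reachable, Map<Integer, Integer> depth) {
            this.reachable = reachable;
            this.depth = depth;
        }
    }

    // adj maps a node id to the ids of its neighbours; it is only read, never modified.
    static Result search(Map<Integer, List<Integer>> adj, int source, int target) {
        Map<Integer, Integer> depth = new HashMap<>(); // algorithm state, outside the graph
        Deque<Integer> queue = new ArrayDeque<>();
        depth.put(source, 0);
        queue.add(source);
        while (!queue.isEmpty()) {
            int u = queue.poll();
            for (int v : adj.getOrDefault(u, Collections.emptyList())) {
                if (!depth.containsKey(v)) {
                    depth.put(v, depth.get(u) + 1);
                    queue.add(v);
                }
            }
        }
        return new Result(depth.containsKey(target), depth);
    }
}
```

A second algorithm can then run on the same graph without having to clean up after the first one.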
How?
* use ad-hoc node objects to contain the information (information directly in the node)
+: all the info there, plus the neighbours in one place
-: need to implement specific nodes for each new application
e.g.
class CityNode {
String name;
int population;
List<CityNode> neighbours;
}
CityNode n;
* use generic node objects to contain the ad-hoc information (information pointed to from the node)
+: generic, no need to develop new nodes
-: the ad-hoc information and the neighbour information are separated; from the former it is not possible to access the latter
e.g.
class GenericNode<K> {
K data;
List<GenericNode<K>> neighbours;
}
class CityNodeData {
String name;
int population;
}
GenericNode<CityNodeData> n;
* use no node objects, only identifiers and separate maps to contain the information (information independent from the nodes)
+: totally respects functional decomposition
-: it is sometimes easier to store data in one place
-: edge information needs two levels of indirection with hashmaps, which can be cumbersome
e.g.
class Node {
List<Node> neighbours;
}
Map<Node, String> mapNodeCityName;
Map<Node, Integer> mapNodeCityPopulation;
Node n;
BUT
adjacency information is node information... and the same question applies: shall adjacency:
- be encoded within the nodes with an adjacency list field: requiring the creation of additional objects
- be encoded in a map node -> adjacency list: node objects are not even needed in this case
And shall lists be encoded with linked lists or array lists?
And shall maps be encoded with hashmaps or carefully managed within arrays?
Arrays are extremely compact and lightning fast for iteration thanks to caching, but slower for modification.
Linked lists and hashmaps are not compact (they need to store pointer information) and are slower for iteration, but faster for modification.
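For the array side of the trade-off, here is a minimal sketch of adjacency packed into two int arrays (the layout often called compressed sparse row). The class name ArrayGraph and its methods are hypothetical. Iterating over the neighbours of a node is a scan over a contiguous slice, but adding an edge afterwards would mean rebuilding both arrays.

```java
import java.util.Arrays;

// Hypothetical sketch: a directed graph stored in two flat arrays, fixed after construction.
class ArrayGraph {
    private final int[] offsets; // neighbours of v are targets[offsets[v]] .. targets[offsets[v + 1] - 1]
    private final int[] targets;

    // edges[k] = {from, to}, with node ids in 0 .. nodeCount - 1.
    ArrayGraph(int nodeCount, int[][] edges) {
        offsets = new int[nodeCount + 1];
        for (int[] e : edges) {
            offsets[e[0] + 1]++;            // count the out-degree of each node
        }
        for (int v = 0; v < nodeCount; v++) {
            offsets[v + 1] += offsets[v];   // prefix sums give the start of each slice
        }
        targets = new int[edges.length];
        int[] cursor = Arrays.copyOf(offsets, nodeCount); // next free slot per node
        for (int[] e : edges) {
            targets[cursor[e[0]]++] = e[1];
        }
    }

    int degree(int v) {
        return offsets[v + 1] - offsets[v];
    }

    // Sum of neighbour ids, as a stand-in for any per-neighbour work.
    int sumOfNeighbours(int v) {
        int sum = 0;
        for (int k = offsets[v]; k < offsets[v + 1]; k++) {
            sum += targets[k];
        }
        return sum;
    }
}
```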
Two distinct use cases appear:
- high performance graph library: like grph, which takes the pain of optimising down to the smallest details, to the point of coding in C in order to save time. To go further, data should be arranged to match the cache access pattern of each implemented algorithm.
- user and developer friendly graph library: like any other Java library, it provides highly reusable Java code, slower than C, based on Java concepts. That's the point of using Java. (And it is strongly typed and faster than Python: that's the point of not using Python.)
It is clear that graphs that are static once loaded could benefit from a speedup over dynamic ones if this is known in advance. However, as the two cases call for two different styles of coding algorithms, switching between them is not as transparent as using two different implementations of one abstract data type.
* Ease of development
The fact is that it is much easier to implement only a bare relation structure, with the additional information encoded outside of it in maps, than to implement the ad-hoc graph structure that precisely fits one's purpose.
e.g. for a graph of cities with two pieces of edge information and three pieces of node information, including spatial ones, it is easier to specify:
Map<Node, String> mapNames = new HashMap<>();
mapNames.put(node1, name1); // node1: the (hypothetical) node that name1 belongs to
etc...
Graph g = new Graph(g, {mapNames, mapPopulation, mapCoordinates}, {mapDistance, mapCapacity}) // not correct Java code btw, but let your imagination fly to understand
than:
AdHocGraph g = new AdHocGraph();
g.addNode(name1, population1, coordinates1);
etc...
because this second solution requires the development of an AdHocGraph with an AdHocNode able to hold precisely a name, a population and coordinates.
The solution, in order to make development easier, is to be more generic and provide a class of nodes that all contain a piece of information of the same parametrized type, e.g. City, where a City object contains precisely the three pieces of information we are interested in.
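A minimal sketch of that generic solution follows. The names GenericGraph and PayloadNode and the fields of City are hypothetical and chosen for illustration only; this is not the actual design of my library. The graph and node classes are written once, and the application supplies only the City type.

```java
import java.util.ArrayList;
import java.util.List;

// Application-specific payload: the three pieces of node information of the example.
class City {
    final String name;
    final int population;
    final double x, y; // spatial coordinates

    City(String name, int population, double x, double y) {
        this.name = name;
        this.population = population;
        this.x = x;
        this.y = y;
    }
}

// Generic node: holds a payload of type T plus its neighbours.
class PayloadNode<T> {
    final T data;
    final List<PayloadNode<T>> neighbours = new ArrayList<>();

    PayloadNode(T data) {
        this.data = data;
    }
}

// Generic undirected graph, reusable for any payload type.
class GenericGraph<T> {
    private final List<PayloadNode<T>> nodes = new ArrayList<>();

    PayloadNode<T> addNode(T data) {
        PayloadNode<T> node = new PayloadNode<>(data);
        nodes.add(node);
        return node;
    }

    void addEdge(PayloadNode<T> a, PayloadNode<T> b) {
        a.neighbours.add(b);
        b.neighbours.add(a);
    }

    int size() {
        return nodes.size();
    }
}
```

Usage then reads almost like the ad-hoc version, e.g. g.addNode(new City(name1, population1, x1, y1)), without any graph code to write per application. Edge information (distance, capacity) is left out of this sketch.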
So the choice is:
Information Within Nodes: for static information of the graph
Information Within Maps: for ad-hoc algorithms
* Sequential or parallel
But you know, we are entering the age of distributed and parallel algorithms... What we just discussed is fine for centralized and sequential processing... What about the future?
Distributed algorithms are pieces of information spread over a network (...a graph...) which interact by exchanging information. If there is no central point of control, the algorithm is more than distributed: it is decentralised.
To be perfectly clear about the meanings:
- sequential: fixed workflow + 1 thread
- parallel: fixed workflow + N threads (same computer)
- distributed: fixed workflow + N threads (different computers) + communication delays + faults
- decentralised: distributed with several independent workflows
Not all algorithms can be distributed, but not all algorithms need to be distributed either. What we are more interested in is harnessing the power of parallel computation, rather than coping with the full set of difficulties of distributed computing (server faults, communication faults) and its methods (data replication, process replication, consensus algorithms).
Sometimes an algorithm cannot be fully distributed but can be partially parallelised, with some parts of the problem being computed independently and then possibly combined. For instance, any matrix summation can be fully parallelised, with worker jobs working on independent parts of the problem and combining their solutions by writing them into the result matrix. Matrix multiplication by a divide-and-conquer scheme such as Strassen's algorithm can be only partially parallelised, because the results computed over independent parts of the problem require additional computation to be combined; this introduces sequential dependencies between the computations, which therefore cannot all be executed in parallel.
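The fully parallel case can be sketched in a few lines of Java with a parallel stream over the row indices. The class name ParallelMatrixSum is hypothetical, and the two matrices are assumed to be rectangular, non-empty and of the same shape. Each worker writes only to its own rows of the result, so no combination step and no locking are needed.

```java
import java.util.stream.IntStream;

// Hypothetical sketch: element-wise sum of two matrices, one independent task per row.
class ParallelMatrixSum {
    static double[][] add(double[][] a, double[][] b) {
        int rows = a.length;
        int cols = a[0].length;
        double[][] c = new double[rows][cols];
        // Rows are independent: row i of c depends only on row i of a and b.
        IntStream.range(0, rows).parallel().forEach(i -> {
            for (int j = 0; j < cols; j++) {
                c[i][j] = a[i][j] + b[i][j];
            }
        });
        return c;
    }
}
```

For small matrices the cost of scheduling the parallel tasks outweighs the gain; the point here is only the absence of dependencies between the parts.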
When using a decentralised algorithm there is only one choice for the location of data: inside the nodes of the graph. For distribution and parallelisation, however, this is not the case. As in the sequential case, data can be stored inside the graph at the level of nodes, or outside of the graph in a separate map. In distribution, storing the data inside the nodes makes it simpler to transmit information to the processors: they just process the nodes, whereas storing the data in maps requires first extracting the relevant parts and transmitting them along with the graph to the processors. In parallelisation, all the processors can access the same memory, which makes the distinction irrelevant.
The catch here is that which parts of a computation on a graph can be parallelised is determined by the graph structure itself, while all the code that parallelises, computes and aggregates pieces of the solution has nothing to do with the design of a graph library. Still, we need to put distribution at the core of the algorithms.
Let us consider different kinds of distribution for the problem of computing shortest paths with Dijkstra's algorithm:
* - node level: each node executes a local Dijkstra algorithm that updates its own information given the messages it receives, and updates the other nodes. The heap, which is global data, must be managed through a consensus algorithm. Such a solution is slow due to the communication overhead and should be used only when strictly necessary
* - group of nodes level: each processor is responsible for the computation over a subpart of the graph: this can provide a significant speedup if communication costs and memory contention are low. Note that this would not be exactly Dijkstra's algorithm, as separating the graph and assembling the parts of the solution in a principled way require additional work.
* - graph level: no distribution, no speedup.
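For reference, the graph-level case is plain sequential Dijkstra, sketched below. The class name SequentialDijkstra and the representation of the graph as a map of maps are assumptions made for the example, and edge weights are assumed to be non-negative. The single priority queue is the global data that the node-level variant would have to manage through consensus, and the distances are kept in a map outside the graph.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

// Hypothetical sketch of the "graph level" case: one thread, one global heap.
class SequentialDijkstra {
    private static class QueueItem {
        final String node;
        final int distance;

        QueueItem(String node, int distance) {
            this.node = node;
            this.distance = distance;
        }
    }

    // graph.get(u).get(v) is the non-negative weight of the edge u -> v.
    // Returns the shortest distance from source to every reachable node.
    static Map<String, Integer> distances(Map<String, Map<String, Integer>> graph, String source) {
        Map<String, Integer> settled = new HashMap<>(); // algorithm state, outside the graph
        PriorityQueue<QueueItem> heap =
                new PriorityQueue<>((a, b) -> Integer.compare(a.distance, b.distance));
        heap.add(new QueueItem(source, 0));
        while (!heap.isEmpty()) {
            QueueItem top = heap.poll();
            if (settled.containsKey(top.node)) {
                continue; // stale entry: this node was already settled with a shorter distance
            }
            settled.put(top.node, top.distance);
            Map<String, Integer> out = graph.getOrDefault(top.node, Collections.emptyMap());
            for (Map.Entry<String, Integer> edge : out.entrySet()) {
                if (!settled.containsKey(edge.getKey())) {
                    heap.add(new QueueItem(edge.getKey(), top.distance + edge.getValue()));
                }
            }
        }
        return settled;
    }
}
```

Every iteration of the loop depends on the heap left by the previous one, which is exactly the sequential dependency that the node-level and group-level variants have to work around.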
Unfortunately, time is limited, and I will never have the time to make the perfect Java graph library. Here are nevertheless some tips and facts to keep in mind in order to do so.
Graphs are about:
- relation between objects
And possibly:
- information about objects
- information about relation between objects
yes, but:
What?
i.e. which information?
* about the relations:
- directed or undirected
- normal graph or multigraph
- normal graph or hypergraph
- both directed and directed edges?
- weighted? (e.g. for min paths)
- weighted with several weights? (e.g. for max flows)
- with spatial information?
- with any other information?
Now, where, when and how to store these information?
Where?
* within the objects: in an information field of the objects
n.name = "toto";
n.name;
+: do not use hashmap
-: increases the size of a node
* outside of the graph: in a map object-> information that stores information separately from the objects and the graph
names.put(n,"toto");
names.get(n);
+: the algorithms that modify information sets leaves the graph untouched (the graph does not contains state information)
-: uses hashtables (heavy, centralised)
* within the graph but outside the objects: the same map, but stored at the level of the graph
g.names.put(n,"toto");
g.names.get(n);
+: the info about the graph is stored within the graph (easy to access)
-: the graph contains state information
-: uses hashtable
When?
* If the info associated with a node is frequently accessed it is better to store this info in a dedicated field.
+: fast
-: requires implementing ad-hoc classes
* If several algorithms are going to be used on a given graph, it is neither possible nor desirable to let each of them modify the graph, i.e. change its state. As such, it is better to store the information relevant only to one algorithm outside of the graph.
+: respects functional decomposition
+: easy to implement
-: state info is sometimes interesting or necessary; solution: return it along with the standard return value in a tuple
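As a minimal sketch of this last point, here is a breadth-first search that only reads the graph and hands back all of its state in one result object. The names BfsSketch, Result and bfs are hypothetical, and the graph is assumed to be a plain map from node name to neighbour names.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;

class BfsSketch {
    /** The "tuple" returned by the algorithm: all of its state, none of it kept in the graph. */
    static final class Result {
        final Map<String, Integer> distance = new HashMap<>();
        final Map<String, String> parent = new HashMap<>();
    }

    /** Breadth-first search over an adjacency map; the graph is only read, never modified. */
    static Result bfs(Map<String, List<String>> graph, String source) {
        Result r = new Result();
        Queue<String> queue = new ArrayDeque<>();
        r.distance.put(source, 0);
        queue.add(source);
        while (!queue.isEmpty()) {
            String u = queue.remove();
            for (String v : graph.getOrDefault(u, Collections.<String>emptyList())) {
                if (!r.distance.containsKey(v)) {   // first visit gives the hop count
                    r.distance.put(v, r.distance.get(u) + 1);
                    r.parent.put(v, u);
                    queue.add(v);
                }
            }
        }
        return r;
    }

    public static void main(String[] args) {
        Map<String, List<String>> graph = new HashMap<>();
        graph.put("a", Arrays.asList("b", "c"));
        graph.put("b", Arrays.asList("d"));
        graph.put("c", Arrays.asList("d"));
        Result r = bfs(graph, "a");
        System.out.println(r.distance + " " + r.parent);
    }
}
```

Two algorithms can then run on the same graph one after the other, or side by side, without stepping on each other's marks.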
How?
* use ad-hoc node objects to contain the information (information directly in the node)
+: all the info is there, plus the neighbours, in one place
-: need to implement specific nodes for each new application
e.g.
class CityNode {
String name;
int population;
List<CityNode> neighbours;
}
CityNode n;
* use generic node objects to contain the ad-hoc information (information pointed to from the node)
+: generic, no need to develop new nodes
-: the ad-hoc information and the neighbour information are separated: from the former it is not possible to access the latter
e.g.
class GenericNode<K> {
K data;
List<GenericNode<K>> neighbours;
}
class CityNodeData {
String name;
int population;
}
GenericNode<CityNodeData> n;
* use no node objects, only identifiers and separate maps to contain the information (information independent from the nodes)
+: totally respects functional decomposition
-: it is sometimes easier to store data in one place
-: edge information needs two levels of indirection with hashmaps, which can be cumbersome
e.g.
class Node {
List<Node> neighbours;
}
Map<Node, String> mapNodeCityName;
Map<Node, Integer> mapNodeCityPopulation;
Node n;
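The two levels of indirection for edge information can be hidden behind a small helper. This is only a sketch: EdgeMap and its put/get methods are hypothetical names, and the value stored is arbitrary.

```java
import java.util.HashMap;
import java.util.Map;

class EdgeMapSketch {
    /** Edge information stored outside the graph: source -> (target -> value). */
    static final class EdgeMap<N, V> {
        private final Map<N, Map<N, V>> values = new HashMap<>();

        void put(N from, N to, V value) {
            // First level: find (or create) the row of the source node.
            values.computeIfAbsent(from, k -> new HashMap<N, V>()).put(to, value);
        }

        /** Returns the stored value, or null when the edge carries none. */
        V get(N from, N to) {
            Map<N, V> row = values.get(from);
            return row == null ? null : row.get(to);   // second level: the target node
        }
    }

    public static void main(String[] args) {
        EdgeMap<String, Double> distance = new EdgeMap<>();
        distance.put("a", "b", 12.5);                  // arbitrary illustrative value
        System.out.println(distance.get("a", "b"));
        System.out.println(distance.get("b", "a"));    // the lookup is directed
    }
}
```

For an undirected graph the caller has to store both directions or normalise the pair before the lookup, which is part of what makes this approach cumbersome.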
BUT
adjacency information is node information... and the same questions apply: shall adjacency
- be encoded within the nodes with an adjacency list field: this requires the creation of additional objects
- be encoded in a map node -> adjacency list: node objects are not even needed in this case
And shall lists be encoded as linked lists or array lists?
And shall maps be encoded as hashmaps or carefully managed within arrays?
Arrays are extremely compact and lightning fast for iteration thanks to caching, but slower for modification.
Linked lists and hashmaps are not compact (they need to store pointer information) and are slower for iteration, but faster for modification.
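To make the array option concrete, here is a sketch of a static graph packed into two int arrays (a layout often called compressed sparse row). The class name CompactGraphSketch and its methods are hypothetical, and nodes are assumed to be numbered 0..nodeCount-1.

```java
import java.util.Arrays;

class CompactGraphSketch {
    // Neighbours of node u are targets[offsets[u]] .. targets[offsets[u + 1] - 1].
    final int[] offsets;
    final int[] targets;

    /** Builds the two arrays from directed edges {from, to} over nodes 0..nodeCount-1. */
    CompactGraphSketch(int nodeCount, int[][] edges) {
        offsets = new int[nodeCount + 1];
        for (int[] e : edges) {
            offsets[e[0] + 1]++;                 // count the out-degree of each node
        }
        for (int u = 0; u < nodeCount; u++) {
            offsets[u + 1] += offsets[u];        // prefix sums give the start of each block
        }
        targets = new int[edges.length];
        int[] next = Arrays.copyOf(offsets, nodeCount);   // next free slot per node
        for (int[] e : edges) {
            targets[next[e[0]]++] = e[1];
        }
    }

    int degree(int u) {
        return offsets[u + 1] - offsets[u];
    }

    /** Iterating over neighbours is a scan over a contiguous slice of one array. */
    int sumOfNeighbourIds(int u) {
        int sum = 0;
        for (int i = offsets[u]; i < offsets[u + 1]; i++) {
            sum += targets[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        int[][] edges = { {0, 1}, {0, 2}, {1, 2}, {2, 0} };
        CompactGraphSketch g = new CompactGraphSketch(3, edges);
        System.out.println(g.degree(0) + " " + Arrays.toString(g.targets));
    }
}
```

The price is visible in the constructor: adding a single edge later means shifting or rebuilding the arrays, whereas a linked list or hashmap would absorb it in place.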
Two distinct use cases appear:
- high-performance graph library: like grph, which takes pains to optimise the minute details, to the point of coding in C in order to save time. To go further, data should be arranged to match the cache access pattern of each implemented algorithm.
- user- and developer-friendly graph library: like any other Java library, it provides highly reusable Java code, slower than C, based on Java concepts. That's the point of using Java (and it is strongly typed and faster than Python: that's the point of not using Python).
It is clear that graphs that are static once loaded could benefit from a speedup over dynamic ones if this is known in advance. However, as these involve two different styles of coding algorithms, it is not as transparent as using two different implementations of one abstract data type.
* Ease of development
The fact is that it is much easier to implement only a bare relation structure, with the additional information encoded outside of it in maps, than to implement the ad-hoc graph structure that precisely fits one's purpose.
e.g. for a graph of cities with two pieces of edge information and three pieces of node information, including spatial ones, it is easier to specify:
Map<Node, String> mapNames = new HashMap<>();
mapNames.put(node1, name1);
etc...
Graph g = new Graph(g, {mapNames, mapPopulation, mapCoordinates}, {mapDistance, mapCapacity}) // not correct Java code btw, but let your imagination fly to understand
Than:
AdHocGraph g = new AdHocGraph()
g.addNode(name1, population1, coordinates1)
etc...
because this second solution requires the development of an AdHocGraph with an AdHocNode able to hold precisely a name, a population and a coordinate.
The solution, in order to make development easier, is to be more generic and provide a class of nodes that all contain information of the same parametrized type, e.g. City, where a City object contains precisely the three pieces of information we are interested in.
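A possible shape for such a generic graph, as a sketch only: Graph, Node, City, addNode and addEdge are hypothetical names, and the edges are assumed to be undirected.

```java
import java.util.ArrayList;
import java.util.List;

class GenericGraphSketch {
    /** Application data: the three pieces of node information of the example. */
    static final class City {
        final String name;
        final int population;
        final double x, y;   // coordinates

        City(String name, int population, double x, double y) {
            this.name = name;
            this.population = population;
            this.x = x;
            this.y = y;
        }
    }

    /** Generic node: the library only knows about T and the neighbours. */
    static final class Node<T> {
        final T data;
        final List<Node<T>> neighbours = new ArrayList<>();

        Node(T data) {
            this.data = data;
        }
    }

    static final class Graph<T> {
        final List<Node<T>> nodes = new ArrayList<>();

        Node<T> addNode(T data) {
            Node<T> n = new Node<>(data);
            nodes.add(n);
            return n;
        }

        /** Adds an undirected edge by registering each node as a neighbour of the other. */
        void addEdge(Node<T> a, Node<T> b) {
            a.neighbours.add(b);
            b.neighbours.add(a);
        }
    }

    public static void main(String[] args) {
        Graph<City> g = new Graph<>();
        Node<City> a = g.addNode(new City("A", 10, 0.0, 0.0));   // made-up values
        Node<City> b = g.addNode(new City("B", 20, 1.0, 2.0));
        g.addEdge(a, b);
        System.out.println(a.neighbours.get(0).data.name);
    }
}
```

The graph code is written once, and a new application only has to supply its own data class in place of City.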
So the rule of thumb is:
Information Within Nodes : for static information of the graph
Information Within Maps : for ad-hoc algorithms
* Sequential or parallel
But you know, we are entering the age of distributed and parallel algorithms... What we just discussed is fine for centralized and sequential processing... What about the future?
Distributed algorithms are pieces of information spread over a network (...a graph...) which interact by exchanging information. If there is no central point of control, the algorithm is more than distributed: it is decentralised.
To be perfectly clear about the meanings:
- sequential: fixed workflow + 1 thread
- parallel: fixed workflow + N threads (same computer)
- distributed: fixed workflow + N threads (different computers) + communication delays + faults
- decentralised: distributed with several independent workflows
Not all algorithms can be distributed, but not all algorithms need to be distributed either. What we are more interested in is harnessing the power of parallel computation, rather than coping with the full set of difficulties of distributed computing (server faults, communication faults) and its methods (data replication, process replication, consensus algorithms).
Sometimes an algorithm cannot be fully distributed but can be partially parallelised, with some parts of the problem being computed independently and possibly combined afterwards. For instance, any matrix summation can be fully parallelised, with worker jobs working on independent parts of the problem and combining their solutions by writing them into the result matrix. Matrix multiplication with a divide-and-conquer scheme such as the Strassen algorithm can be only partially parallelised: the results computed over independent parts of the problem require additional computation to be combined, which introduces sequential dependencies, so the computations cannot all be executed in parallel.
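The matrix summation case can be sketched with Java parallel streams. The class and method names are hypothetical, and the two matrices are assumed to have the same dimensions.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

class ParallelMatrixSumSketch {
    /** Adds two matrices of equal dimensions; each row is an independent job. */
    static double[][] add(double[][] a, double[][] b) {
        int rows = a.length;
        double[][] result = new double[rows][];
        IntStream.range(0, rows).parallel().forEach(i -> {
            double[] row = new double[a[i].length];
            for (int j = 0; j < row.length; j++) {
                row[j] = a[i][j] + b[i][j];
            }
            result[i] = row;   // each job writes only its own row: no coordination needed
        });
        return result;
    }

    public static void main(String[] args) {
        double[][] a = { {1, 2}, {3, 4} };
        double[][] b = { {10, 20}, {30, 40} };
        System.out.println(Arrays.deepToString(add(a, b)));
    }
}
```

No job reads what another job writes, so there is nothing to combine beyond placing each row in the result.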
When using a decentralised algorithm there is only one choice for the location of data: inside the nodes of the graph. For distribution and parallelisation, however, this is not the case. As in the sequential case, data can be stored inside the graph at the level of the nodes, or outside of the graph in a separate map. In distribution, storing the data in the nodes makes it simpler to transmit information to the processors: they just process the nodes, whereas storing the data in maps requires first extracting the relevant parts and transmitting them along with the graph to the processors. In parallelisation, all the processors can access the same memory, making the distinction irrelevant.
The catch here is that which parts of a computation on a graph can be parallelised is determined by the graph structure itself, while all the code that parallelises, computes and aggregates pieces of the solution has nothing to do with the design of a graph library. Still, we need to put distribution at the core of the algorithms.
Let us consider different kinds of distribution in the problem of computing shortest paths with Dijkstra:
* - node level: each node executes a local Dijkstra algorithm that updates its information given the messages it receives, and updates other nodes. The heap, which is a global data structure, must be managed through a consensus algorithm. Such a solution is slow due to the communication overhead and should be used only when strictly necessary.
* - group of nodes level: each processor is responsible for the computation of a subpart of the graph: this can provide a significant speedup if communication costs or memory contention are low. Note that this wouldn't be the exact Dijkstra algorithm, as separating the graph and assembling the parts of the solution require additional work that must be done in a principled way.
* - graph level: no distribution, no speedup.
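For reference, the graph-level case is the plain sequential algorithm. Below is a sketch that assumes non-negative weights given as a nested map (source -> target -> weight) and keeps the distances in a map outside of the graph; DijkstraSketch, QueueEntry and shortestDistances are hypothetical names.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.PriorityQueue;

class DijkstraSketch {
    /** One heap entry: a node and the tentative distance it was queued with. */
    static final class QueueEntry {
        final String node;
        final double distance;

        QueueEntry(String node, double distance) {
            this.node = node;
            this.distance = distance;
        }
    }

    /** Sequential Dijkstra; returns the distance of every node reachable from source. */
    static Map<String, Double> shortestDistances(Map<String, Map<String, Double>> weights, String source) {
        Map<String, Double> settled = new HashMap<>();
        // The heap is the single global structure that makes the node-level variant hard.
        PriorityQueue<QueueEntry> heap =
                new PriorityQueue<>((a, b) -> Double.compare(a.distance, b.distance));
        heap.add(new QueueEntry(source, 0.0));
        while (!heap.isEmpty()) {
            QueueEntry top = heap.poll();
            if (settled.containsKey(top.node)) {
                continue;                          // stale entry: node already settled
            }
            settled.put(top.node, top.distance);
            Map<String, Double> out = weights.getOrDefault(top.node, Collections.<String, Double>emptyMap());
            for (Map.Entry<String, Double> e : out.entrySet()) {
                if (!settled.containsKey(e.getKey())) {
                    heap.add(new QueueEntry(e.getKey(), top.distance + e.getValue()));
                }
            }
        }
        return settled;
    }

    public static void main(String[] args) {
        Map<String, Map<String, Double>> w = new HashMap<>();
        w.computeIfAbsent("a", k -> new HashMap<String, Double>()).put("b", 1.0);
        w.computeIfAbsent("a", k -> new HashMap<String, Double>()).put("c", 4.0);
        w.computeIfAbsent("b", k -> new HashMap<String, Double>()).put("c", 2.0);
        System.out.println(shortestDistances(w, "a"));
    }
}
```

Every iteration pops from the same heap, which is why the finer-grained variants either pay for consensus or stop being exactly Dijkstra.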