2019-02-17 §
21:20 <bstorm_> The slave of labsdb1005.eqiad.wmnet is now clouddb1001.clouddb-services.eqiad.wmflabs [production]
19:16 <arturo> T193264 delete VM clouddb-services-01 [clouddb-services]
18:54 <arturo> T193264 create VM clouddb-services-01 for PoC of running maintain-dbusers from here [clouddb-services]
18:34 <zhuyifei1999_> restarted webservice. it still has a phantom pod trusty-tools-909545302-jwrz7 at tools-worker-1010.tools.eqiad.wmflabs which refuses to terminate [tools.trusty-tools]
13:14 <XioNoX> add term labsdb_return to cloud-in4 - T216353 [production]
07:41 <wikibugs> Updated channels.yaml to: 62469f2db86d26c599400a55b9a7642ef95ce8d9 Update for Acme-chief project rename [tools.wikibugs]
07:21 <legoktm> deploying https://gerrit.wikimedia.org/r/491029 [releng]
07:10 <legoktm> Building image docker-registry.discovery.wmnet/releng/tox-acme-chief:0.3.4 [releng]
06:28 <legoktm> building new tox-acme-chief docker image https://gerrit.wikimedia.org/r/489725 [releng]
01:12 <Krinkle> beta-scap-eqiad (cron) failing with "sudo: a password is required" [releng]
2019-02-16 §
19:44 <Krinkle> Reloading Zuul to deploy https://gerrit.wikimedia.org/r/490937 (T216275) [releng]
17:23 <thcipriani> installed php7.0-curl on deployment-deploy01 (why was that suddenly necessary?) [releng]
16:26 <ariel@deploy1001> Finished deploy [dumps/dumps@8f83eea]: fix up multistream index file recombines for large files; better errors for misc dumps failures (duration: 00m 03s) [production]
16:25 <ariel@deploy1001> Started deploy [dumps/dumps@8f83eea]: fix up multistream index file recombines for large files; better errors for misc dumps failures [production]
14:21 <arturo> T194855 cloudvirt1020 is poweroff, waiting for disk setup before installing [production]
13:59 <arturo> T193264 switched clouddb1001/1004 to the new project local puppetmaster [clouddb-services]
13:54 <arturo> T193264 create 'clouddb10' puppet prefix to store puppet/hiera config for database servers in this project [clouddb-services]
13:47 <arturo> T193264 create 'clouddb-services-puppetmaster' puppet prefix to store puppet/hiera config for this project puppetmaster [clouddb-services]
13:43 <arturo> T193264 create 'clouddb-services-puppetmaster-01' instance [clouddb-services]
13:33 <arturo> add myself as user and projectadmin [clouddb-services]
05:00 <zhuyifei1999_> fixed by restarting flannel. another puppet run simply started kubelet [tools]
04:58 <zhuyifei1999_> puppet logs: https://phabricator.wikimedia.org/P8097. Docker is failing with 'Failed to load environment files: No such file or directory' [tools]
04:52 <zhuyifei1999_> copied the resolv.conf from tools-k8s-master-01, removing secondary DNS to make sure puppet fixes that, and starting puppet [tools]
04:48 <zhuyifei1999_> that host's resolv.conf is badly broken https://phabricator.wikimedia.org/P8096. The last Puppet run was at Thu Feb 14 15:21:09 UTC 2019 (2247 minutes ago) [tools]
04:44 <zhuyifei1999_> puppet is also failing bad here 'Error: Could not request certificate: getaddrinfo: Name or service not known' [tools]
04:43 <zhuyifei1999_> this one has logs full of 'Can't contact LDAP server' [tools]
04:41 <zhuyifei1999_> nslcd also broken on tools-worker-1005 [tools]
04:34 <zhuyifei1999_> uncordon tools-worker-1014.tools.eqiad.wmflabs [tools]
04:33 <zhuyifei1999_> the issue was, /var/run/nslcd/socket was somehow a directory, AFAICT [tools]
04:31 <zhuyifei1999_> then started nslcd via systemctl and `id zhuyifei1999` returns correct stuffs [tools]
04:30 <zhuyifei1999_> `nslcd -nd` complains about 'nslcd: bind() to /var/run/nslcd/socket failed: Address already in use'. SIGTERMed a background nslcd, `rmdir /var/run/nslcd/socket`, and `nslcd -nd` seemingly starts to work [tools]
04:23 <zhuyifei1999_> drained tools-worker-1014.tools.eqiad.wmflabs [tools]
04:16 <zhuyifei1999_> logs: https://phabricator.wikimedia.org/P8095 [tools]
04:14 <zhuyifei1999_> restarting nslcd on tools-worker-1014 in an attempt to fix that, service failed to start, looking into logs [tools]
04:12 <zhuyifei1999_> restarting nscd on tools-worker-1014 in an attempt to fix seemingly-not-attached-to-LDAP [tools]
00:20 <XioNoX> add port 22 in cloud-in4 term labsdb [production]
2019-02-15 §
23:42 <bd808> Added BryanDavis (self), Arturo Borrero Gonzalez, Marostegui, and Jcrespo as admins in project [clouddb-services]
22:49 <bstorm_> created mariadb security group and lvs for a new database T193264 [clouddb-services]
22:49 <Joan> Restarted CVNBot3 (Last message was received on RCReader 5729.637672 seconds ago) [cvn]
20:40 <andrewbogott> enabled virtualization (all three settings) on cloudvirt1019 [production]
19:41 <arturo> T193264 reimaging cloudvirt1019 to get mitaka/stretch [production]
18:51 <arturo> T193264 icinga downtime cloudvirt1019 for 1 week [production]
18:44 <bstorm_> stopped replication and then mariadb on labsdb1004 [production]
18:18 <nuria> restarted turnilo in analytics-tool1002 [analytics]
17:28 <thcipriani> integration-slave-jessie-1002:/srv/jenkins-workspace/workspace$ `sudo rm -rf *` due to full disk [releng]
16:52 <cdanis> correction, needed to increment version; adding backported rasdaemon 0.6.0-1.2+deb8u2 to jessie-wikimedia [production]
16:48 <cdanis> adding backported rasdaemon 0.6.0-1.2+deb8u1 to jessie-wikimedia [production]
16:29 <bblack> reprepro: uploaded gdnsd-3.0.0-1~wmf1 to stretch-wikimedia [production]
16:28 <Lucas_WMDE> moved cronjob from trusty to stretch (following [[wikitech:News/Toolforge Trusty Move a cron job]]) [tools.wmde-access]
15:45 <moritzm> rebooting auth1001 for kernel security update [production]