Full Hacker News

Web Name: Full Hacker News

WebSite: http://www.fullhn.com

ID:11766

Keywords:

Full, Hacker, News

Description:

Fond (YC W12) Is Hiring Customer Success Manager in Portland jobs.lever.co Comments Fond (fond.co) is a SaaS platform that seamlessly consolidates employee rewards and recognition processes into one easy-to-use solution. With Fond, employees and managers can recognize each other, redeem rewards, access exclusive corporate discounts, and measure success so HR departments spend less time managing programs and more time driving results. Some of our current customers include Facebook, Salesforce, and Visa, plus hundreds of others. Fond is a Y Combinator company funded by investors including DCM, Andreessen Horowitz, and SV Angel. If you want to be part of the team that delivers industry-leading engagement and recognition capabilities for top companies, Fond is for you! As a Customer Success Manager, you will be responsible for activities spanning from onboarding, goal setting, and metric management to ensuring engagement, retention, and negotiating upsells. You will have an in-depth understanding of customers' overall business needs and possess the ability to identify and articulate how Fond can enhance their business goals. Serving as an employee happiness expert, the right candidate is truly passionate about customer advocacy. This is an exciting opportunity to drive the success of our customers and, in doing so, directly impact the growth of Fond. At Fond, we celebrate uniqueness. We don't discriminate on the basis of race, religion, color, nationality, gender, sexual orientation, age, marital status, veteran status, or disability status. Nor will we hold it against you if you prefer House Hunters to House of Cards or Cheez-Its to Goldfish. We welcome all types!

Is venture capital producing the kinds of inventions society needs? www.technologyreview.com Comments Y Combinator advised Gray not to tell me how much funding he was seeking, because it looks bad if you don't hit the mark. But his idea was built to appeal to investors. Other ideas he'd considered earlier were more like moonshots—hotels for homeless people, for example. "The challenge here is to build a business that does good and can raise money. You need to figure out how to monetize it," Gray said. "If you can help people and they can pay for it, that's the key." For all his idealism, he had adapted to a venture system that has evolved to act as the spear tip of profit-seeking capitalism and American individualism. I asked Charley Ellis why he thought all these smart investors and entrepreneurs hadn't put their time and money into health systems that could detect infectious diseases, or quicker ways to develop drugs and vaccines, or unemployment benefit systems that could cope with a sudden crush of applications. Ellis pointed out that people have a hard time seeing outside their universe. "People inside an industry are so focused on creating money for their industry," he said. "Nobody wants to stop the game." Gray is definitely in the game. He lost his father, who worked on Wall Street, to cancer when he was a young teenager and then went to Columbia University, where he studied philosophy and astronomy. After he figured out that academia moved too slowly for him, he enrolled at Wharton, the University of Pennsylvania's business school. This Ivy League pedigree gave him access to a world most entrepreneurs can't dream of reaching.
Adam Grant, a famous UPenn management professor, became an adviser to Ophelia, and he discussed his idea with Tom McClellan, Barack Obama's drug czar. Listening to Gray, it was hard not to think about the advantages wealth and connections offer. These benefits have been quantified by researchers who studied 1 million US patent holders and looked at their parents' income. Low-income students who scored in the top 5% in math were no more likely to become inventors than below-average math students from affluent families, they found. Meanwhile, if women, minorities, and children from low-income families were to invent at the same rate as white men from families with incomes in the top 20%, the rate of innovation in America would quadruple. The advantages of wealth build on each other. Information is an important one: Gray knew from the beginning that he wanted to get into Y Combinator, which he'd heard about as a student. And getting into the accelerator, in turn, "de-risked and legitimized Ophelia," he says. With that important stamp of approval, he was able to recruit a cofounder, Mattan Griffel, a more experienced entrepreneur who became his chief operating officer.

Slow evolution

Still, while Ophelia fits the traditional profile of an investable company for the likes of Y Combinator and the venture capitalists who go on to fund its startups, the industry has been changing, at least a little. Recent years have brought a new class of "impact investors," who eschew the profit-obsessed venture capital model to focus on social good as well as high returns. And following a series of lawsuits and accusations of sexual harassment and discrimination, some new faces are getting a seat at the table. Susan Choe, the founder of Katalyst Ventures, is an investor in Zipline, whose drones deliver medical supplies in poor countries where infrastructure is lacking. It's valued at more than $1 billion. She also pointed me to All Raise, an organization that promotes women in venture capital. It reported in 2019 that a record 54 women became VC partners, though 65% of venture capital firms still have no female partners. "Change is being driven by the fear of being left behind," says Choe, who says that limited partners—investors—in her funds include executives from outside the US. Millennials tend to be drawn toward more diverse teams, too, she says. She is among those who make the case that venture capital firms overlook products and services that cater to ignored communities or create new markets. "Investors are leaving money on the table, and they are missing innovation because the people that are running these VCs cannot relate to the preferences of people that are living outside their experiences," says Lisa Green Hall, a fellow at Georgetown's Beeck Center for Social Impact Innovation and former CEO of Calvert Impact Capital. "In the white male culture ... those cultures are extremely narrow. For women and people of color, those cultures are much more expansive." It brought to mind Jasmine Edwards, a black woman from Tampa, Florida, who launched an education startup that aimed to help schools with low-income students find better substitute teachers. With 200 substitute teachers on the platform and three schools as paying customers, the startup ran out of time and cash, and it folded. What could have been different if she had been able to raise the funds she needed to continue?

What are you building?
On April 18, Marc Andreessen emerged with another essay, this time occasioned by the pandemic and titled "It's Time to Build." He wrote: "Every step of the way, to everyone around us, we should be asking the question, what are you building? What are you building directly, or helping other people to build, or teaching other people to build, or taking care of people who are building? If the work you're doing isn't either leading to something being built or taking care of people directly, we've failed you, and we need to get you into a position, an occupation, a career where you can contribute to building." He talked about skyscrapers and factories and said people should listen to Elon Musk. He called on everyone to build, although he didn't make it clear what he would be building—or investing in—himself. (Andreessen declined to comment for this story.) I revisited the Andreessen Horowitz portfolio, which includes dozens of software winners, like Facebook, Box, Zynga, and GitHub, but not many companies building things that would have been useful in tackling the pandemic. One sunny day, I took my two daughters over to Arlington Cemetery, right outside Washington, DC, to leave sunflowers on my mom's grave. The radio was buzzing over Musk's announcement that his new baby would be called X Æ A-12. "Who would do that to their kid?" asked Quinn. "Don't worry," Lillie said. "X Æ A-12 Musk will be able to pay other kids not to bully him." Before covid-19, I would have laughed off Andreessen's bluster and Musk's theatrics as inconsequential. But the pandemic made the gap between the world they live in and the world the rest of us inhabit seem even larger and more important. Indeed, it has become clearer that things many people thought about life in America aren't true. The nation wasn't ready for a pandemic. It hasn't made much progress on providing justice for all, as the riots provoked by police brutality in late May reminded us. And it is hard to claim that it remains the world's most innovative economy. Software and technology are only one corner of the innovation playground, and the US has been so focused on the noisy kids in the sandbox that it has failed to maintain the rest of the equipment. People who really study innovation systems "realize that venture capital may not be a perfect model" for all of them, says Carol Dahl, executive director of the Lemelson Foundation, which supports inventors and entrepreneurs building physical products. In the United States, she says, 75% of venture capital goes to software. Some 5 to 10% goes to biotech: a tiny handful of venture capitalists have mastered the longer art of building a biotech company. The other sliver goes to everything else—"transportation, sanitation, health care." To fund a complete system of innovation, we need to think about "not only the downstream invention itself, but what preceded it," Dahl says. "Not only inspiring people who want to invent, but thinking about the way products reach us through companies." Dahl told me about a company that had developed reusable protective gear when Ebola emerged, and was now slowly ramping up production. What if it had been supported by venture funds earlier on? That's not going to happen, Asheem Chandna, a partner at Greylock, a leading VC firm, told me: "Money is going to flow where returns are.
If software continues to have returns, that's where it will flow." Even with targeted government subsidies that lower the risks for VCs, he said, most people will stick with what they know. So how can that change? The government could turn on the fire hose again, restoring that huge spray of investment that got Silicon Valley started in the first place. In his book Jump-Starting America, MIT professor Jonathan Gruber found that although total US spending on R&D remains at 2.5% of GDP, the share coming from the private sector has increased to 70%, up from less than half in the early 1950s through the 1970s. Federal funding for R&D as a share of GDP is now below where it was in 1957, according to the Information Technology and Innovation Foundation (ITIF), a think tank. In government funding for university research as a share of GDP, the US is 28th of 39 nations, and 12 of those nations invest more than twice the proportion the US does. In other words, the private sector, with its focus on fast profits and familiar patterns, now dominates America's innovation spending. That, Dahl and others argue, means the biggest innovations cannot find their long paths to widespread adoption. We've "replaced breakthrough innovation with incremental innovation," says Rob Atkinson, founder of the ITIF. And thanks to Silicon Valley's excellent marketing, we mistake increments for breakthroughs. In his book, Gruber lists three innovations that the US has given away because it didn't have the infrastructure to bring them to market: synthetic biology, hydrogen power, and ocean exploration. In most cases, companies in other countries commercialized the research because America's way of investing in ideas hadn't worked. The loss is incalculable. It is potentially enough to have started entire industries like Silicon Valley, perhaps in areas that never recovered after the 2008 recession, or in communities that have been hit hardest by the coronavirus. World Bank economists determined that in 1900, Argentina, Chile, Denmark, Sweden, and the southern United States had similar levels of income but vastly differing capacities to innovate. This gap helped predict future income: the US and the Nordic countries sped ahead while Latin America lost ground. It's been easy to dismiss people who say America is now more like a developing country than a developed one. But if the ability to solve society's problems through innovation disappears, that may be the path it is on.

Game over

Despite being thrown into chaos because of covid-19, Y Combinator's Demo Day turned out to be a success. More than 1,600 investors participated, up from the typical 1,000. Rather than being jammed into Pier 48 in San Francisco, investors logged on to a website where they saw a single-slide company summary, an eight- to 10-sentence description, and a three- to five-sentence team bio. Among the companies alongside Ophelia were Trustle, which gives parents access to a dedicated parenting and child development expert for $50, and Breezeful, which uses machine learning to find the best home mortgages. Usually, about 80% of companies at Demo Day receive funding within six months of the event. The accelerator says it's too early to provide this year's stats. But it was a happy result for Ophelia, which got $2.7 million from General Catalyst, Refactor Capital, and Y Combinator itself. Gray is aware that he landed the money when many face deep financial trouble. "It feels very strange," he acknowledges.
"But I felt and still feel extremely confident with what we're building. The entire purpose of our business is to help people." But in a game run by venture capital, the people you end up helping are the ones who can pay, so investors can make their money. In today's America, that leaves out a lot of people. As I finished my reporting, a friend sent me an article about Nikki King, a young woman from Appalachia. She has more or less the same idea as Gray—providing medicine for addiction—but started out by focusing on her community. She runs a program in the courthouse in Ripley County, Indiana. In its first year, it treated 63 people, most of whom had not relapsed. There's no technology; broadband's not so great in southern Indiana. She's in a constant scramble for money, relying on grants, donations, and Medicaid reimbursements. I told her about Gray and his $2.7 million. "Rub it in, why don't you?" she said. With that much money, she could run five programs. "In this community here, we raised between $50,000 and $70,000," she said. "I'm grateful for all my donations, because they were given by people who don't have a lot to give. But it's not $2.7 million."

We are making two changes to our standard deal in conjunction with a recent fundraise. Starting with the Winter 2021 batch, our deal will be $125,000 for 7% equity on a post-money safe, and we will reduce the amount of our pro rata right to 4% of subsequent rounds. In the coming years, this will enable us to fund as many as 3,000 more companies. For background, YC originally gave about $20,000 for 6% of a company. In 2011, Yuri Milner and SV Angel began offering an additional $150k to every startup in YC. We continued this program with new investors and reduced the deal to about $100k for 7%. In 2014, we increased that amount to $120k, and in 2018 to $150k when we raised our last fund. We have changed our deal several times over the years as we have raised new funds, modified budgets, and adapted to the current environment and economy. We do not expect this to be the last time we change the deal, but we do feel this is the right place to be for the next several years. Comments

Dynamic linking

Do your installed programs share dynamic libraries?

Findings: not really. Over half of your libraries are used by fewer than 0.1% of your executables.

Number of times each dynamic library is required by a program

libs.awk:

/\t.*\.so.*/ {
    n = split($1, p, "/")
    split(p[n], l, ".")
    lib = l[1]
    if (libs[lib] == "") {
        libs[lib] = 0
    }
    libs[lib] += 1
}

END {
    for (lib in libs) {
        print libs[lib] "\t" lib
    }
}

Usage:

$ find /usr/bin -type f -executable -print \
    | xargs ldd 2>/dev/null \
    | awk -f libs.awk \
    | sort -rn > results.txt
$ awk '{ print NR "\t" $1 }' < results.txt > nresults.txt
$ gnuplot
gnuplot> plot 'nresults.txt'

My results:

$ find /usr/bin -type f -executable -print | wc -l
$ head -n20 results.txt
4496  libc
4484  linux-vdso
4483  ld-linux-x86-64
2654  libm
2301  libdl
2216  libpthread
1419  libgcc_s
1301  libz
1144  libstdc++
805   liblzma
785   librt
771   libXdmcp
771   libxcb
771   libXau
755   libX11
703   libpcre
667   libglib-2
658   libffi
578   libresolv
559   libXext

Is loading dynamically linked programs faster?

Findings: definitely not.
Linkage    Avg. startup time
Dynamic    137263 ns
Static     64048 ns

ex.c:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

int main(int argc, char *argv[]) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    fprintf(stdout, "%ld\t", ts.tv_nsec);
    fflush(stdout);
    if (argc == 1) {
        char *args[] = { "", "", NULL };
        execvp(argv[0], args);
    } else {
        fprintf(stdout, "\n");
    }
    return 0;
}

test.sh:

#!/bin/sh
i=0
while [ $i -lt 1000 ]
do
    ./ex
    i=$((i+1))
done

My results:

$ musl-gcc -o ex ex.c
$ ./test.sh | awk 'BEGIN { sum = 0 } { sum += $2-$1 } END { print sum / NR }'
137263
$ musl-gcc -static -o ex ex.c
$ ./test.sh | awk 'BEGIN { sum = 0 } { sum += $2-$1 } END { print sum / NR }'
64048

Wouldn't statically linked executables be huge?

Findings: not really. On average, dynamically linked executables use only 4.6% of the symbols on offer from their dependencies. A good linker will remove unused symbols.

% of symbols requested by dynamically linked programs from the libraries they depend on

nsyms.go:

package main

import (
    "bufio"
    "fmt"
    "os"
    "os/exec"
    "path/filepath"
    "strings"
)

func main() {
    ldd := exec.Command("ldd", os.Args[1])
    rc, err := ldd.StdoutPipe()
    if err != nil {
        panic(err)
    }
    ldd.Start()

    var libpaths []string
    scan := bufio.NewScanner(rc)
    for scan.Scan() {
        line := scan.Text()[1:] /* \t */
        sp := strings.Split(line, " ")
        var lib string
        if strings.Contains(line, "=> ") {
            lib = sp[2]
        } else {
            lib = sp[0]
        }
        if !filepath.IsAbs(lib) {
            lib = "/usr/lib/" + lib
        }
        libpaths = append(libpaths, lib)
    }
    ldd.Wait()
    rc.Close()

    syms := make(map[string]interface{})
    for _, path := range libpaths {
        objdump := exec.Command("objdump", "-T", path)
        rc, err := objdump.StdoutPipe()
        if err != nil {
            panic(err)
        }
        objdump.Start()
        scan := bufio.NewScanner(rc)
        for i := 0; scan.Scan(); i++ {
            if i < 4 {
                continue
            }
            line := scan.Text()
            sp := strings.Split(line, " ")
            if len(sp) < 5 {
                continue
            }
            sym := sp[len(sp)-1]
            syms[sym] = nil
        }
        objdump.Wait()
        rc.Close()
    }

    objdump := exec.Command("objdump", "-R", os.Args[1])
    rc, err = objdump.StdoutPipe()
    if err != nil {
        panic(err)
    }
    objdump.Start()
    used := make(map[string]interface{})
    scan = bufio.NewScanner(rc)
    for i := 0; scan.Scan(); i++ {
        if i < 5 {
            continue
        }
        sp := strings.Split(scan.Text(), " ")
        if len(sp) < 3 {
            continue
        }
        sym := sp[len(sp)-1]
        used[sym] = nil
    }
    objdump.Wait()
    rc.Close()

    if len(syms) != 0 && len(used) != 0 && len(used) <= len(syms) {
        fmt.Printf("%50s\t%d\t%d\t%f\n", os.Args[1], len(syms), len(used),
            float64(len(used))/float64(len(syms)))
    }
}

Usage:

$ find /usr/bin -type f -executable -print | xargs -n1 ./nsyms > results.txt
$ awk '{ n += $4 } END { print n / NR }' < results.txt

My results

Will security vulnerabilities in libraries that have been statically linked cause large or unmanageable updates?

Findings: not really. Not including libc, the only libraries which had "critical" or "high" severity vulnerabilities in 2019 which affected over 100 binaries on my system were dbus, gnutls, cairo, libssh2, and curl. 265 binaries were affected by the rest.

The total download cost to upgrade all binaries on my system which were affected by CVEs in 2019 is 3.8 GiB. This is reduced to 1.0 GiB if you eliminate glibc.

It is also unknown if any of these vulnerabilities would have been introduced after the last build date for a given statically linked binary; if so, that binary would not need to be updated. Many vulnerabilities are also limited to a specific code path or use-case, and binaries which do not invoke that code path in their dependencies will not be affected.
A process to ascertain this information in the wake of a vulnerability could be automated.

arch-security

extractcves.py:

import email.utils
import mailbox
import re
import shlex
import time

pacman_re = re.compile(r'pacman -Syu .*')
severity_re = re.compile(r'Severity: (.*)')

mbox = mailbox.mbox("arch-security.mbox")
for m in mbox.items():
    m = m[1]
    date = m["Date"]
    for part in m.walk():
        if part.is_multipart():
            continue
        content_type = part.get_content_type()
        [charset] = part.get_charsets("utf-8")
        if content_type == 'text/plain':
            body = part.get_payload(decode=True).decode(charset)
            break
    pkgs = pacman_re.findall(body)
    severity = severity_re.findall(body)
    date = email.utils.parsedate(date)
    if len(pkgs) == 0 or date is None:
        continue
    if date[0] <= 2018 or date[0] > 2019:
        continue
    severity = severity[0]
    args = shlex.split(pkgs[0])
    pkg = args[2].split(">=")[0]
    print(pkg, severity)

$ python3 extractcves.py | grep Critical > cves.txt
$ xargs pacman -Ql < cves.txt | grep \\.so | awk '{print $1}' | sort -u > affected.txt
# Manually remove packages like Firefox, Thunderbird, etc; write remainder.txt
$ xargs pacman -Ql < remainder.txt | grep '/usr/lib/.*.so$' | awk '{ print $2 }' > libs.txt
$ ldd /usr/bin/* > ldd.txt
$ ./scope.sh < libs.txt | sort -nr > sobjects.txt

sobjects.txt is a sorted list of shared objects and the number of executables that link to them. To find the total size of affected binaries, I ran the following commands:

# With libc
$ egrep -la 'libc.so|libm.so|libdl.so|libpthread.so|librt.so|libresolv.so|libdbus-1.so|libgnutls.so|libcairo.so|libutil.so|libssh2.so|libcurl.so|libcairo-gobject.so|libcrypt.so|libspice-server.so|libarchive.so|libSDL2-2.0.so|libmvec.so|libmagic.so|libtextstyle.so|libgettextlib-0.20.2.so|libgettextsrc-0.20.2.so|libMagickWand-7.Q16HDRI.so|libMagickCore-7.Q16HDRI.so|libbfd-2.34.0.so|libpolkit-gobject-1.so|libwebkit2gtk-4.0.so|libjavascriptcoregtk-4.0.so|libpolkit-agent-1.so|libgs.so|libctf.so|libSDL.so|libopcodes-2.34.0.so|libQt5WebEngine.so|libQt5WebEngineCore.so|libctf-nobfd.so|libcairo-script-interpreter.so' /usr/bin/* | xargs wc -c

# Without libc
$ egrep -la 'libdbus-1.so|libgnutls.so|libcairo.so|libssh2.so|libcurl.so|libcairo-gobject.so|libcrypt.so|libspice-server.so|libarchive.so|libSDL2-2.0.so|libmvec.so|libmagic.so|libtextstyle.so|libgettextlib-0.20.2.so|libgettextsrc-0.20.2.so|libMagickWand-7.Q16HDRI.so|libMagickCore-7.Q16HDRI.so|libbfd-2.34.0.so|libpolkit-gobject-1.so|libwebkit2gtk-4.0.so|libjavascriptcoregtk-4.0.so|libpolkit-agent-1.so|libgs.so|libctf.so|libSDL.so|libopcodes-2.34.0.so|libQt5WebEngine.so|libQt5WebEngineCore.so|libctf-nobfd.so|libcairo-script-interpreter.so' /usr/bin/* | xargs wc -c

Test environment:
Arch Linux, up-to-date as of 2020-06-25
2188 packages installed
gcc 10.1.0

Santa Cruz, California bans predictive policing in U.S. first www.reuters.com Comments NEW YORK (Thomson Reuters Foundation) - As officials mull steps to tackle police brutality and racism, California's Santa Cruz has become the first U.S. city to ban predictive policing, which digital rights experts said could spark similar moves across the country. "Understanding how predictive policing and facial recognition can be disproportionately biased against people of color, we officially banned the use of these technologies in the city of Santa Cruz," Mayor Justin Cummings said on Wednesday. His administration will work with the police to "help eliminate racism in policing", the seaside city's first male African-American mayor said on his Facebook page, following a vote on Tuesday evening.
Used by police across the United States for almost a decade, predictive policing relies on algorithms to interpret police records, analyzing arrest or parole data to send officers to target chronic offenders, or identifying places where crime may occur. But critics say it reinforces racist patterns of policing - low-income, ethnic minority neighbourhoods have historically been overpoliced, so the data shows them as crime hotspots, leading to the deployment of more police to those areas. "As Santa Cruz rightly recognized, predictive policing and facial recognition are dangerous, racially biased technologies that should never be used by our government," said Matt Cagle, a lawyer with the ACLU. PredPol Inc, the Santa Cruz-headquartered firm that pioneered the technology, said that it supported the city resolution's requirement that predictive policing "will not perpetuate bias", among other criteria. "Given the institutionalized state of racial inequality in America, this is a legitimate filter to be applied to any new technology acquired by a public entity, whether used for public safety or not," it said on Twitter on Tuesday. Boston's city council on Wednesday voted to ban face surveillance technology, a move also welcomed by digital rights activists. "Lawmakers across the country have a responsibility to step up and dismantle surveillance systems that have long been used to repress activism, target communities of color, and invade people's private lives," Cagle said in emailed comments. Roya Pakzad, a Santa Cruz-based researcher who founded human rights group Taraaz, said campaigners would continue to push for more limits on police use of technology, including requiring them to seek community approval of any new surveillance tool. "What is still missed in this ordinance is a requirement for public transparency and oversight," she told the Thomson Reuters Foundation. Reporting by Avi Asher-Schapiro @AASchapiro, Editing by Zoe Tabary.

One of the challenges of application engineering within an established company like Dropbox is to break out of the cycle of incremental improvements and look at a problem fresh. Our colleagues who do user research help by regularly reminding us of the customer's perspective, but so can our friends and family when they complain that product experiences aren't as simple as they could be. One such complaint, in fact, led to a new product now available to all our users called Dropbox Transfer. Transfer lets Dropbox users quickly send large files, and confirm receipt, even if the recipient isn't a Dropbox user. You could already do most of this with a Dropbox shared link, but what you couldn't do before Transfer turned out to be significant for many of our users. For instance, with a shared link, the content needs to be inside your Dropbox folder, which affects your storage quota. If you are just one large video file away from being over quota, sending that file presents a challenge. And one of the benefits of a shared link is that it's always connected to the current version of the file.
This feature, however, can become a hassle in cases where you want to send a read-only snapshot of a file instead of a live-updating link. The more we dug into it, the more we realized that file sharing and file sending have very different use cases and needs. For file transfers, it's really helpful to get a notification when a recipient has downloaded their files. This led us to provide the sender with a dashboard of statistics about downloads and views, prompting them to follow up with their recipient if the files are not retrieved. And unlike sharing use cases, where link persistence is the expected default, with sending cases many people prefer the option of ephemeral expiring links and password protection, increasing the security of confidential content and allowing a "send and forget" workflow. Because of these differences, we chose to build an entirely new product to solve these sending needs, rather than overcomplicating our existing sharing features. Listening to the voices of people around us (whether Dropbox users or not) helped us break away from preconceived notions based on what is easy and incremental to build on top of the Dropbox stack.

This blog is the story of how Transfer was built from the point of view of its lead engineer, from prototyping and testing to production. As software engineers, we're used to optimizing. Engineers have our fingerprints all over a piece of software. Things like performance, error states, and device platform strategies (what devices we support) are disproportionately decided by engineers. But what outcomes are we optimizing for? Do we focus on performance or flexibility? When aggressive deadlines hit, which features should be cut or modified? These judgements of cost vs. value are often made by an engineer hundreds of times in a typical month. To correctly answer these optimization questions, engineers must know our customers.

Research is all around us

Ideation

Product development, as with machine learning, follows either a deductive or an inductive reasoning path. In machine learning there are two major methods of training: supervised (deductive) and unsupervised (inductive). Supervised training starts with a known shape of inputs and outputs: e.g. trying to curve-fit a line. Unsupervised learning attempts to draw inferences from data: e.g. using datapoint clustering to try to understand what questions to ask. In deductive product development, you build a hypothesis, "users want x to get y done," and then validate it with prototyping and user research. The inductive approach observes that "users are exhibiting x behavior," and then asks, "what is the y we should build?" We built Transfer with the first approach and are refining it with the second. I will focus on the first in this post.

So how does one come up with a hypothesis to test? There are many ways to come up with these initial seeds of ideas: open-ended surveys, rapid prototyping and iteration, focus groups, and emulation and combination of existing tools or products. Less often mentioned within tech circles is the easiest method of all: observing and examining your surroundings. Fortunately, at Dropbox, problems that need solving are not hard to find. Research is all around us, because the audience is essentially anyone with digital content. If we listen, we can let them guide us and sense-check our path. This is how Transfer got its start. My partner complained to me that they never could use Dropbox despite my evangelizing. It was simply too hard to use for simple sending.
"Why are there so many clicks needed? Why does it need to be within a Dropbox folder to send? Why do I have to move all the files I uploaded into a folder?" When I heard we might be exploring this problem, I jumped at the opportunity. At the very least, I might be able to persuade my exacting sweetheart to use Dropbox! As the product progressed, I gained more confidence: I wasn't sure whether my accountant had received the files I had sent by email; a videographer friend wanted a quick way to send his raws over to an editor for processing. What started as a personal quest to persuade my partner quickly became a very global effort. Turns out she isn't the only one who wants a new one-way sending paradigm within Dropbox. Not all tools are as general-purpose as Transfer, but overall, listening closely to people's needs and feedback can quickly give directional guidance. For me, personally, it amplified my confidence that Transfer can have a large impact. This is one of the reasons, after five years, that I keep working here: Dropbox users are everywhere. My dad in his human biology research lab storing microscope images and files containing RNA; my neighbor storing contracts in Dropbox so they can read them on the go; some DJ in a club, choosing what track to queue up next using our audio previews. Being a part of the fabric of everyday people's lives is an incredible privilege.

TL;DR: If you're not sure whether something makes sense, just ask a friend or two who might be in the target audience as a sense-check.

Path to validation

After these initial few sparks, from my experiences and the experiences and research of those on the team, we were ready to test out the idea. We attempted to define the idea clearly and strongly enough that it could be proven either right or completely wrong. We did not want inconclusive results here, as that would waste months or years of time. We set out to prove or disprove that "Dropbox users need a quicker way to send files, with set-and-forget security features built in, like file expiration." We started with an email and a sample landing page test: would people even be interested in this description? Turned out they were. Then, curious about actual user behavior, we graduated to a prototype with all the MVP features. In parallel, we ran a set of surveys, including one based around willingness-to-pay to make sure there was a market out there. Later on we started monitoring a measure of product-market fit as we released new features (more on this later).

As an engineer, it's important to always understand this hypothesis and feel empowered to push back and suggest cutting scope if a feature doesn't bubble up to the core hypothesis. This helps product and design hone their mission, users get a cleaner experience, and engineers reduce support cost for features that only 0.1% of users will ever use. A common trap of the engineering "can-do" attitude is enabling feature creep, and eventually a hard-to-manage codebase and a cluttered product. As with product and design, code should seek to be on-message, with the strongest and most central parts corresponding to the most utilized and critical components of an experience. When an engineer starts optimizing for the customer and their use case, the underlying technology and approach become bound to the spectrum of their needs.

Code quality as a spectrum

Every good engineering decision is made up of a number of inputs.
These are things like:
- complexity to build
- resource efficiency
- latency
- compatibility with existing team expertise
- maintainability

Given these traditional criteria, engineers might often fall into the trap of over-optimizing and unwittingly targeting problems the customer doesn't care about. Making sure to always add customer experience to these engineering inputs will help align efforts to deploy the highest code quality on the highest customer-leverage areas. If you consider a user's experience to be an algorithm, this really is just a riff on the classic performance wisdom that comes out of Amdahl's law: focus on optimizing the places where customers are spending (or want to spend) the most valuable time.

Remember: hacky technical solutions can be correct engineering solutions. Optimizing the quality of unimportant parts will only lead to unimportant optimizations. Please note: I'm not advocating for writing a lot of messy, fire-prone code, just for staying aware of the big picture at all times.

We built a product we planned to delete in 2 months

Why not just build the actual thing? When exploring new product spaces, it is unclear where the user need (and/or revenue) lies. It's usually helpful to decouple product learning from sustainability. When building a completely new surface, the optimized solutions for each of these are usually never the same.

Learning: Optimize for flexibility. Do whatever it takes to show something valuable to a small set of users. This type of code might not even be code, but rather clickable prototypes built in tools like Figma.

Sustainability: Optimize for longer-term growth. This type of code might include things like clearly delineated, less-optimized "crumple zones" that can be improved as the product scales and needs to be more efficient. It should also include aspirational APIs compatible with extensions such as batching or pagination.

How we did it

Smoke and mirrors. We took an existing product, forked part of the frontend, and applied a bunch of new CSS to make an existing product based around galleries become a "new" one based around a list of files. Only a few backend changes were needed. Mindful of its eventual removal, we surrounded all the prototype code with clearly marked comment blocks so we could quickly clean up after we were done.

Results? After a month with it, people were sad to see it go, a sentiment we quantified with the Sean Ellis score. Sad enough that we had to take this to part II. When it came time to tear down the temporary product—a prototype of hacks built on more hacks—our team needed to decide how we'd build the real thing.

Fortunately, Transfer is built on the concept of files, and files are something that Dropbox does well regardless of what they're being used for. Our efficient and high-quality storage systems, abuse prevention, and previews pipelines, optimized over many years, fit directly into our product. Unfortunately, moving up the stack, the sharing and syncing models could not be reused. While the underlying technology behind sharing and the sync engine has been rebuilt and optimized over the years (with the most recent leap in innovation being our Nucleus rewrite), the product behavior and sharing model had remained largely unchanged for the last 10 years. The majority of the sharing stack assumed that files uploaded to Dropbox would always be accessible inside of Dropbox and take up quota.
Additionally, there was an assumption that links would refer to live-updating content, rather than a snapshot of the file at link creation time. For the file sync infrastructure, there was an assumption of asynchronous upload: the idea that content would eventually be added to the server. For a product that was meant to immediately upload and share content while the user waits, the queuing and eventual-consistency concept that had worked for file syncing would be disastrous for our user experience.

While sync and share seemed homologous to sending, their underlying technologies had many years of product behavior assumptions baked in. It would take much longer than the seven months of development time we had to relax these, so we chose to rebuild large pieces of these areas of the stack rather than adapt the existing code (while leveraging our existing storage infrastructure as-is). The response to our prototype had given us the conviction to take this harder technical path in order to provide a specific product experience, rather than change the product experience to match the shortest and most immediately compatible technical approach.

It's important to note that each decision to "do it ourselves" was made in conversation with the platform teams. We simply needed things that were too far out and not validated enough to be on their near-term roadmap. Now that Transfer has proven to be successful, I'm already seeing the amount of infrastructure code our product team owns shrinking, as our product partners add flexibility into their systems to adapt to our use cases. Rather than take a hard dependency on our platform partners, we were able to reduce temporary inter-team complexity and accelerate our own roadmap by building our own solution. Our habit of choosing to actively reduce cross-team dependencies also proved essential in hitting our goals.

When working in established codebases, here are some tips to keep things moving fast:

Be creative

Similar products are closer than you think. In our case, we found that sending photo albums had many similarities with what we were trying to do. This cut months off of development time, as we were able to leverage some ancient, but serviceable and battle-tested, code to back our initial sharing model.

Always ask about scale

At large companies, processes are often developed to work at the scale of their largest product. When working on new projects with user bases initially many orders of magnitude smaller than a core product, always start a meeting with another team by telling them your expected user base size in the near future. Teams might assume you're operating at "main product scale" and base guidance around this. It's your job to make this clear. This can save your team from never launching a product because they're too busy solving for the 200M-user case before solving for the 100-user one.

Learn about users other than yourself

One thing we did early on was to build a set of internationalized string components. This took us extra time initially, but, armed with the knowledge that roughly half of all Dropbox users speak a language other than English, we knew the user impact would be well worth our time. One of our prouder engineering moments was when we were about to launch to the initial alpha population and got to tell the PM, who had assumed we hadn't translated the product, that we should include a global population in the pre-release group.
She was ecstatic the engineers had self-organized and decided this needed to be done.

Know what can change and what can't

Sometimes things just can't be done in a timely fashion. If they don't contribute to the core identity of the product, consider letting them go. For us, this was the decision to initially not support emailing Transfers. Sending by copying a link was good enough for beta.

Always know where you are and where you want to be

When reviewing the specs for wide-reaching changes, or reading the new code itself, it's useful to ask two questions:
- Where is this going? What is the ideal state of this?
- Where along this path is this? What number of stepping-stones should I expect to be looking at?

We would constantly step back from decisions around what's important and what's not in terms of building the core identity into the product. We'd also constantly assess how easy it would be to change our minds if we had to (Type 1 vs. Type 2 decisions, in Jeff Bezos' lingo). Some of the hardest calls were around our core, critical path code, code that would process hundreds of thousands of gigabytes per month. These were, for us, the (initial) file uploader and our underlying data storage model: inherited code which was neither the cleanest nor the best unit-tested. Due to time and resource constraints, we had to settle for simply "battle tested" and "present" over other factors.

The file uploader we chose for the web version of Transfer was the same uploader used on a number of other areas of the website, primarily file browse and file requests. This uploader was based on a 2012 version of a third-party library called PLupload, a very full-featured library providing functionality and polyfills that would go unused by our product. Since this uploader worked, it was hard to justify a rewrite during the initial product construction. However, as this library (at least the 2012 version of it) was heavily event-driven and mutates the DOM, it immediately started causing reliability issues when wrapped inside a React component. Strange things started happening: items would get randomly stuck while uploading during our bug-bashes. Long-running uploads would error due to DOM nodes disappearing suddenly, causing a cascade of node deletions as React tried to push the "hard-reset" button on the corrupted DOM tree. We chose to keep it, but as abstracted away as possible.

We took a similar approach to the Nucleus migration: we started out by building an interface exposing every feature of PLupload we wanted to use. This interface consisted of our own datatypes rather than PLupload's. This served two roles:
- Testing got much better, as we had a boundary. We were able to both dependency-inject a mock library to test the product code, and also connect the inner code to a test harness with clear expectations around the I/O of each method.
- The added benefit of this boundary was that it would eventually act as a shim when we had time to swap out the library with something simpler. This also forced us to come up with our requirements for a rewrite ahead of time, greatly increasing the productivity of the rewrite.

The underlying sharing code we chose was based not around the well-maintained Dropbox file and folder sharing links, but rather a much older link type created initially for sharing photo albums. These album links allowed us to quickly stand up functional Transfer links. The benefit was that we were adapting a known system: other teams knew what these links were.
The customer experience team was able to reuse playbooks and guides surrounding these links, the Security and Abuse teams already had threat models and monitoring on these links, and the team owning the sharing infrastructure already had context. By not having to build new infrastructure, we were able to reduce variables, allowing us to focus more on product development than foundational changes. To allow us to migrate to a new system later, as with the web uploader, we wrapped this ancient set of helpers in our own facades.

As our scaling up and launch played out, it became clear we had made the correct architecture calls: these parts held and performed. Good engineering can be as much about action as it is about restraint. When we did revisit these later, we had the space to take a more thoughtful and holistic approach than we would have months earlier. Note: In early 2020 we migrated entirely off storing our data in the photo-album system, giving us reliability, maintainability, and performance improvements.

A strong culture of discourse

Each crossroad can be critical. Having a culture of inclusion where each voice is considered based on an argument's merit is an essential component of making key technical calls correctly. When core questions like the ones above come up, answering them incorrectly can lead to long detours, future tech debt, or even product collapse. Sometimes maintaining a badly built product is more costly than the value it creates.

In the specific case of reusing the photo album code, one of my teammates vehemently opposed the idea of taking on the tech debt from this system. The ensuing discussion over the course of a few weeks resulted in a number of proposals, documents, and meetings that uncovered and evaluated the time commitments required for different alternatives. Although we chose to take the photo album code with us to GA, the points raised through these meetings galvanized the short-term approach, backfilling the thoughtfulness that was either unspoken or lacking in its initial proposal, and brought the team together on a unified short- and long-term vision for our sharing model's backing code. These meetings helped set the eventual roadmap for complete migration off of the system. Without a well-knit team motivated to speak their mind at each intersection, the quality of these decisions suffers.

I was lucky enough to work with 10 amazing engineers in a culture of open discourse on phase II of this project. I remember many afternoons spent working with the team to find our best path forward, balancing concerns from one engineer's lens and another's. Throughout, the glue that kept us moving forward through these discussions was the user need. Before each meeting, we'd try to articulate the "so what" of it all, and at the end try to bring it back again to the user. Whether this was a due date we needed to hit for a feature or a level of quality expected, we could all align around the user as our ultimate priority. When we disagree, we ask "What would the user want?" and use that as our compass. Keeping that customer voice growing in each of us, through things like subscribing to the feedback mailing list or participating in user research interviews, has proved crucial to our success as an engineering team. It is not enough to just have product managers, designers, and researchers thinking about the customer; engineers must as well.
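To make the uploader facade described earlier a bit more concrete, here is a minimal sketch of the pattern in Python (chosen for brevity; Dropbox's web client is not Python, and every name below — Uploader, PluploadAdapter, add_file, start — is invented for illustration rather than an actual Dropbox or PLupload API): product code depends only on a small interface built from its own datatypes, the legacy library hides behind an adapter, and tests inject a fake.

# Hypothetical sketch of "wrap the legacy uploader in our own facade".
# Class and method names are invented for this example, not real APIs.
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class UploadResult:
    file_id: str
    ok: bool


class Uploader(ABC):
    """The only surface product code is allowed to depend on."""

    @abstractmethod
    def upload(self, path: str) -> UploadResult: ...


class PluploadAdapter(Uploader):
    """Adapts an event-driven legacy library behind our own datatypes."""

    def __init__(self, legacy_client):
        self._client = legacy_client  # injected, so tests can pass a fake

    def upload(self, path: str) -> UploadResult:
        handle = self._client.add_file(path)  # assumed legacy call
        self._client.start()                  # assumed legacy call
        return UploadResult(file_id=str(handle), ok=True)


class FakeUploader(Uploader):
    """Test double: lets product code be exercised without the real library."""

    def upload(self, path: str) -> UploadResult:
        return UploadResult(file_id="fake", ok=True)


uploader: Uploader = FakeUploader()
print(uploader.upload("video.mov"))  # UploadResult(file_id='fake', ok=True)

The same boundary that makes testing possible also doubles as the shim for later swapping in a simpler uploader, which is the role the post describes.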
The value of Tor and anonymous contributions to Wikipedia blog.torproject.org Comments Tor users are conscientious about the tools they pick to do what they do online. Often, discussions of controversial topics need a different level of privacy depending on a user's threat models. An activist in the Middle East can provide a different perspective on an article about politics in their own country than a collaborator in northern Europe. And they deserve to add their voices to the conversation safely. There are many reasons a person might want to be anonymous when they write, edit, or share information. But some web services, including Wikipedia, ban (or have banned) Tor users from participating, effectively banning anonymous contributors. According to a recently published research paper co-authored by researchers from Drexel, NYU, and the University of Washington, Tor users make high-quality contributions to Wikipedia. And, when they are blocked, as doctoral candidate Chau Tran, the lead author, describes, "the collateral damage in the form of unrealized valuable contributions from anonymity seekers is invisible." The authors of the paper include Chau Tran (NYU), Kaylea Champion (UW CDSC), Andrea Forte (Drexel), Benjamin Mako Hill (UW CDSC), and Rachel Greenstadt (NYU). The paper was published at the 2020 IEEE Symposium on Security & Privacy, held between May 18 and 20. By examining more than 11,000 Wikipedia edits made by Tor users able to bypass Wikipedia's Tor ban between 2007 and 2018, the research team found that Tor users made edits of similar quality to those of IP editors, who are non-logged-in users identified by their IP addresses, and first-time editors. The paper notes that Tor users, on average, contributed higher-quality changes to articles than non-logged-in IP editors. The study also finds that Tor-based editors are more likely than other users to focus on topics that may be considered controversial, such as politics, technology, and religion. Related research implies Tor users are quite similar to other internet users, and Tor users frequently visit websites in the Alexa top one million. The new study findings make clear how anonymous users are raising the bar on community discussions and how valuable anonymity is to avoid self-censorship. Anonymity and privacy can help protect users from consequences that may prevent them from interacting with the Wikipedia community. Wikipedia has tried to block users coming from the Tor network since 2007, alleging vandalism, spam, and abuse. This research tells a different story: that people use Tor to make meaningful contributions to Wikipedia, and Tor may allow some users to add their voice to conversations in which they may not otherwise be safely able to participate. Freedom on the internet is diminishing globally, and surveillance and censorship are on the rise. Now is the time to finally allow private users to safely participate in building collective knowledge for all humanity. More info:

George Toderici, Google Research
Michael Tschannen, Google Research
Eirikur Agustsson, Google Research

About

We combine Generative Adversarial Networks with learned compression to obtain a state-of-the-art generative lossy compression system. In the paper, we investigate normalization layers, generator and discriminator architectures, training strategies, as well as perceptual losses. In a user study, we show that our method is preferred to previous state-of-the-art approaches even if they use more than 2× the bitrate. Best viewed on a big screen.
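To illustrate how an objective of this kind is typically assembled — a learned codec whose generator is trained against rate, distortion, a perceptual term, and an adversarial term — here is a rough sketch. It is a generic illustration with placeholder weights and term names, not the paper's exact formulation or code.

# Schematic generator objective for a GAN-based learned codec (illustrative only).
import math

def generator_loss(rate_bits, distortion, perceptual, disc_score_on_recon,
                   lam=1.0, k_percep=1.0, beta=1.0):
    # rate_bits: estimated bits needed to store the encoder's latent
    # distortion: pixel-level error (e.g. MSE) between original and reconstruction
    # perceptual: a perceptual distance between the two images
    # disc_score_on_recon: discriminator's belief, in (0, 1), that the reconstruction is real
    adversarial = -math.log(disc_score_on_recon)  # reward reconstructions that fool the discriminator
    return lam * rate_bits + distortion + k_percep * perceptual + beta * adversarial

# Example: lowering lam tolerates a higher bitrate in exchange for better fidelity.
print(generator_loss(rate_bits=0.237, distortion=0.02, perceptual=0.1, disc_score_on_recon=0.8))

The adversarial and perceptual terms are the components that push reconstructions toward looking natural at low bitrates, which is the effect the user study below is measuring.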
Demo: Interactive demo comparing our method (HiFiC) to JPG or BPG. The following shows normalized scores for the user study, compared to perceptual metrics, where lower is better for all. HiFiC is our method. M&S is the deep-learning-based Mean & Scale Hyperprior, from Minnen et al., optimized for mean squared error. BPG is a non-learned codec based on H.265 that achieves very high PSNR. No GAN is our baseline, using the same architecture and distortion as HiFiC, but no GAN. Below each method, we show average bits per pixel (bpp) on the images from the user study, and for learned methods we show the loss components. The study shows that training with a GAN yields reconstructions that outperform BPG at practical bitrates, for high-resolution images. Our model at 0.237 bpp is preferred to BPG even if BPG uses 2.1× the bitrate, and to MSE-optimized models even if they use 1.7× the bitrate.

Citation:
title={High-Fidelity Generative Image Compression},
author={Mentzer, Fabian and Toderici, George and Tschannen, Michael and Agustsson, Eirikur},
journal={arXiv preprint arXiv:2006.09965},
year={2020}

Carving out a niche as a small artist on Spotify www.stevebenjamins.com Comments So how do you get on editorial playlists? I sincerely have no clue. I've been on editorial playlists 16 times but I have no idea how to replicate that. Note: In my experience, major label artists have an easier time getting on editorial playlists. This is discouraging—if major label artists have an easier time getting on editorial playlists, they'll also get better placement in algorithmic playlists… But major label artists also only typically get 13% - 20% of streaming royalties… so there are still plenty of reasons to stay indie!

5. Algorithms Make For Looser Relationships Between Artists And Listeners

"Music itself is going to become like running water or electricity." - David Bowie

My most popular song 'Circles' has been played 1,350,000 times. And every month about 65,000 people listen to my music on Spotify. Guess how many people follow me on Instagram? 480. Just because people listen to me on Spotify doesn't mean they want a deep relationship. Most listeners just add 'Circles' to their library and move on with their life. I'm definitely okay with this. I'd rather listeners follow me on Spotify than on Instagram anyways. Plus only good things can come out of de-coupling fame from music. Make music because it's what you love to do—not because you want to get famous. If you want to get famous I think you're better off as a YouTuber or TikTok influencer anyways! I think this is just the nature of algorithmic playlists—they lead to a high volume of listeners with a very loose connection to you as an artist:

Mobilewalla used cellphone data to estimate the demographics of protesters www.buzzfeednews.com Comments Data company Mobilewalla used cellphone information to estimate the demographics of protesters. Sen. Elizabeth Warren says it's "shady" and concerning. Posted on June 25, 2020, at 2:40 p.m. ET

Demonstrators march outside of the state capitol building on May 31, 2020 in Saint Paul, Minnesota as they protest the death of George Floyd. (Photo by KEREM YUCEL/AFP via Getty Images)

On the weekend of May 29, thousands of people marched, sang, grieved, and chanted, demanding an end to police brutality and the defunding of police departments in the aftermath of the police killings of George Floyd and Breonna Taylor.
They marched en masse in cities like Minneapolis, New York, Los Angeles, and Atlanta, empowered by their number and the assumed anonymity of the crowd. And they did so completely unaware that a tech company was using location data harvested from their cellphones to predict their race, age, and gender and where they lived.

Just over two weeks later, that company, Mobilewalla, released a report titled "George Floyd Protester Demographics: Insights Across 4 Major US Cities." In 60 pie charts, the document details what percentage of protesters the company believes were male or female; young adult (18–34), middle-aged (35–54), or older (55+); and "African-American," "Caucasian/Others," "Hispanic," or "Asian-American." "African American males made up the majority of protesters in the four observed cities vs. females," Mobilewalla claimed. "Men vs. women in Atlanta (61% vs. 39%), in Los Angeles (65% vs. 35%), in Minneapolis (54% vs. 46%) and in New York (59% vs. 41%)." The company analyzed data from 16,902 devices at protests — including exactly 8,152 devices in New York, 4,527 in Los Angeles, 2,357 in Minneapolis, and 1,866 in Atlanta.

Sen. Elizabeth Warren told BuzzFeed News that Mobilewalla's report was alarming, and an example of the consequences of the lack of regulation on data brokers in the US. "This report shows that an enormous number of Americans – probably without even knowing it – are handing over their full location history to shady location data brokers with zero restrictions on what companies can do with it," Warren said. "In an end-run around the Constitution's limits on government surveillance, these companies can even sell this data to the government, which can use it for law and immigration enforcement. That's why I've opened an investigation into the government contracts held by location data brokers, and I'll keep pushing for answers."

Screenshot from "George Floyd Protester Demographics: Insights Across 4 Major US Cities."

It's unclear how accurate Mobilewalla's analysis actually is. But Mobilewalla's report is another revelation from a wild west of obscure companies with untold amounts of sensitive information about individuals — including where they go and what their political allegiances may be. There are no federal laws in place to prevent this information from being abused. Mobilewalla CEO Anindya Datta told BuzzFeed News that the data analysis that made the George Floyd Protester Demographics report possible wasn't a new kind of project. "The underlying data, the underlying observations that came into the report, is something that we collect and produce on a regular basis," he said. Datta said Mobilewalla didn't prepare the report for law enforcement or a public agency, but rather to satisfy its own employees' curiosity about what its vast trove of unregulated data could reveal about the demonstrators. Datta told BuzzFeed News that the company doesn't plan to provide information about whether a person attended a protest to its clients, or to law enforcement agencies. "It's hard to tell you a specific reason as to why we did this," Datta said.
“But over time, a bunch of us in the company were watching with curiosity and some degree of alarm as to what’s going on.” He defined those sources of alarm as what he called "antisocial behavior," including vandalism, looting, and actions like "breaking the glass of an Apple store.” He added that they were attempting to test whether protests were being driven by outside agitators. Datta said that he and a few Mobilewalla employees chose locations where they expected protests would occur — including the George Floyd memorial site in Minneapolis, and Gracie Mansion in New York — and analyzed data from mobile devices in those areas collected between May 29 and May 31. Screenshot from "George Floyd Protester Demographics: Insights Across 4 Major US Cities." Jacinta González, a senior campaign organizer at Latinx advocacy group Mijente, told BuzzFeed News that by monitoring protesters, Mobilewalla could undermine freedom of assembly. “It is really just fundamentally terrifying to understand the way that companies can access such vast amounts of data to process for their own gain — without folks understanding even that they have consented to their information being taken, much less used in this way,” González said. “It’s important to understand that once technology hits the market, it's actually very hard to limit who has access to it — whether it is police, or whether it is other actors that want to harm communities,” González added. “Once this stuff is out there, we just have no way of understanding how it’s being used. Often we don’t even know that it’s out there to begin with.” Mobilewalla does not collect the data itself, but rather buys it from a variety of sources, including advertisers, data brokers, and internet service providers. Once it has it, the company uses artificial intelligence to turn a stew of location data, device IDs, and browser histories into predictions of a person's demographics — including race, age, gender, zip code, or personal interests. Mobilewalla sells aggregated versions of that data back to advertisers. On its website, Mobilewalla says that it works with companies across a variety of industries — like retail, dining, telecom, banking, consulting, health, and on-demand services (like ride-hailing). It’s unclear how accurate this report actually is. Datta told BuzzFeed News that his company, on average, has access to location data for 30% to 60% of people in any given location in the United States. Mobilewalla said in a YouTube video that it collects an average of 25 billion “signals” (or pieces of information, like GPS coordinates) every day. Every week, these signals pour in from an average of 1.6 billion devices. Datta said that about 300 million of these devices are in the US. (This doesn’t mean that Mobilewalla collected data on 300 million people, because one person might have more than one device that Mobilewalla is tracking.) Saira Hussain, a staff attorney for the Electronic Frontier Foundation, told BuzzFeed News that Mobilewalla’s report was not surprising, but very troubling. “If [this data] ends up in the hands of the government, or if protesters are concerned that it could end up in the hands of the government, that may suppress speech, it may deter people from going to protests,” Hussain said. Mobilewalla's privacy policy says that people have the right to opt out of certain uses of their personal information.
But it also says, "Even if you opt out, we, our Clients and third parties may still collect and use information regarding your activities on the Services, Properties, websites and/or applications and/or information from advertisements for other legal purposes as described herein." There is currently no federal law that regulates how companies like Mobilewalla — which buy and sell people’s data on the internet — can use people’s information. Hussain noted that information about data-sharing can be buried in the Terms of Service, and that there isn’t meaningful consent built into most privacy policies. “Given how many different industries that this company works within for targeted advertising, it seems that you probably wouldn’t know, once the information is in the company’s hands, exactly what they’re gonna be using it for,” Hussain said. “And who would know that they’d be using it to track demographics of people at protests?” One of the key metrics for a search system is the indexing latency, the amount of time it takes for new information to become available in the search index. This metric is important because it determines how quickly new results show up. Not all search systems need to update their contents quickly. In a warehouse inventory system, for example, one daily update to its search index might be acceptable. At Twitter -- where people are constantly looking for the answer to “what’s happening” -- real-time search is a must. Until mid-2019, the indexing latency for Twitter’s search system was around 15 seconds. It was fast enough to power relevance-based product features such as the ranked Home timeline, which delivers Tweets based on their relevance. Since determining a Tweet's relevance is based on factors such as how much engagement it gets, there is less need for instant indexing. Use cases requiring a much lower indexing latency, however, couldn’t be powered by our search infrastructure. For example, we couldn’t use this same search system to retrieve Tweets for a person’s profile page, where people generally expect to see their Tweets appear the moment they are published. Two main limitations prevented us from lowering our indexing latency: Some fields (bits of information associated with each Tweet) are not available when a Tweet is created. For example, a fully resolved URL usually provides a much better ranking signal than a shortened URL like http://t.co/foo. However, resolving new URLs takes time, and our old system required us to index all fields for a Tweet at the same time, so we needed to wait for all these delayed fields to become available. Most features in the Twitter application prioritize the newest relevant Tweets. Therefore, we sort our index by the time a Tweet was created. This would be easy to do if our systems received events that were strictly ordered. However, because our search system is decoupled from Tweet creation, the search system doesn’t necessarily receive Tweets in chronological order. To overcome this limitation, we added a buffer in our ingestion pipeline that stored all incoming Tweets for a few seconds, and sorted them in strictly increasing order of created time before sending them to our indexing system. Overcoming these limitations required major changes to our ingestion pipeline and our indexing system, but we believe the results were worth the effort.
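To make the cost of that sorting buffer concrete, here is a minimal sketch of a few-second reorder buffer of the kind described above. It is illustrative only, not Twitter's actual ingestion code, and the names (Tweet, ReorderBuffer, bufferMs) are assumptions.

// Illustrative sketch only: a buffer that holds incoming Tweets briefly and
// releases them in strictly increasing order of creation time.
interface Tweet {
  id: bigint;          // Tweet ID
  createdAtMs: number; // creation time in epoch milliseconds
  text: string;
}

class ReorderBuffer {
  private pending: Tweet[] = [];

  constructor(
    private readonly bufferMs: number,              // how long to hold Tweets (a few seconds)
    private readonly emit: (tweet: Tweet) => void,  // downstream indexing step
  ) {}

  add(tweet: Tweet): void {
    this.pending.push(tweet);
  }

  // Called periodically: release every Tweet that has waited at least bufferMs,
  // in creation-time order, so the indexer sees a sorted stream.
  flush(nowMs: number): void {
    this.pending.sort((a, b) => a.createdAtMs - b.createdAtMs);
    while (this.pending.length > 0 && nowMs - this.pending[0].createdAtMs >= this.bufferMs) {
      this.emit(this.pending.shift()!);
    }
  }
}

The latency cost is visible directly in flush(): nothing can be handed to the indexer until it has waited out the full buffer window, which is exactly the artificial delay the rest of the post describes removing.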
Tweets are now available for searching within one second of creation, which allows us to power product features with strict real-time requirements, such as real-time conversations or the profile pages. Let's take a closer look at how we've achieved this. Posting lists The core of almost all search systems is a data structure called an inverted index. An inverted index is designed to quickly answer questions like "Which documents have the word cat in them?" It does this by keeping a map from terms to posting lists. A term is typically a word, but is sometimes a conjunction, phrase, or number. A posting list is a list of document identifiers (or document IDs) containing the term. The posting list often includes extra information, like the position in the document where the term appears, or payloads to improve the relevance of our ranking algorithms.[1] The search systems at Twitter process hundreds of thousands of queries per second and most involve searching through posting lists of thousands of items, making the speed at which we can iterate through a posting list a critical factor in serving queries efficiently. For example, consider how many Tweets contain the word “the.” Document identifiers We use Lucene as our core indexing technology. In standard Lucene, an index is subdivided into chunks called segments, and document IDs are Java integers. The first document indexed in a particular segment is assigned an ID of 0, and new document IDs are assigned sequentially. When searching through a segment, the search starts at the lowest document IDs in the segment and proceeds to the highest IDs in the segment. To support our requirement of searching for the newest Tweets first, we diverge from standard Lucene and assign document IDs from high to low: the first document in a segment is assigned a maximum ID (determined by how large we want our Lucene segment to be), and each new document gets a smaller document ID. This lets us traverse documents so that we retrieve the newest Tweets first, and terminate queries after we examine a client-specified number of hits. This decision is critical in reducing the amount of time it takes to evaluate a search query and therefore in letting us scale to extremely high request rates. When we were using sorted streams of incoming Tweets, it was easy to assign document IDs: the first Tweet in a segment would get the ID of the size of the segment minus one, the second Tweet would get the size of the segment minus two, and so on, until we got to document ID 0. However, this document ID assignment scheme doesn’t work when the incoming stream isn’t sorted by the time a Tweet was created. In order to remove the delay added by sorting, we needed to come up with a new scheme. In the new document ID assignment scheme, each Tweet is assigned a document ID based on the time that it was created. We needed to fit our document IDs into a 31-bit space, because Lucene uses positive Java integers as document IDs. Each document ID is unique within a segment, and our segments are usually writable for about 12 hours. We decided to allocate 27 bits to store timestamps with millisecond granularity, which is enough for a segment to last for a bit more than 37 hours. We use the last four bits as a counter for the number of Tweets with the same timestamp. This means that if we get more than 2^4 (16) Tweets with the same millisecond timestamp, then some of them will be assigned a document ID that is in a slightly incorrect order.
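As a rough illustration of this packing scheme, the sketch below packs a 27-bit millisecond offset and a 4-bit counter into a positive 31-bit integer. It is an assumption-laden sketch rather than Twitter's code: in particular, the choice of the segment's base timestamp and the flipping of the packed value so that newer Tweets get smaller IDs (to match the newest-first traversal described earlier) are details filled in here for illustration.

// Sketch of the "27-bit millisecond offset + 4-bit counter" document ID layout.
const TIMESTAMP_BITS = 27;
const COUNTER_BITS = 4;
const MAX_OFFSET = (1 << TIMESTAMP_BITS) - 1; // about 37.3 hours of milliseconds
const MAX_COUNTER = (1 << COUNTER_BITS) - 1;  // 16 Tweets per millisecond
const MAX_DOC_ID = 2 ** 31 - 1;               // largest positive Java int (Lucene's limit)

// Per-segment bookkeeping: how many Tweets we have already seen for each
// millisecond offset. A real implementation would keep this inside the segment.
const countByOffset = new Map<number, number>();

function assignDocId(createdAtMs: number, segmentStartMs: number): number | null {
  const offset = createdAtMs - segmentStartMs;
  if (offset < 0 || offset > MAX_OFFSET) return null; // Tweet belongs to a different segment

  const seen = countByOffset.get(offset) ?? 0;
  if (seen > MAX_COUNTER) {
    // More than 16 Tweets in the same millisecond. The post notes the real
    // system tolerates a slightly out-of-order ID here; this sketch just gives up.
    return null;
  }
  countByOffset.set(offset, seen + 1);

  // 27-bit offset in the high bits, 4-bit counter in the low bits.
  const packed = (offset << COUNTER_BITS) | seen;

  // Searchers walk document IDs in increasing order and want the newest Tweets
  // first, so newer Tweets need smaller IDs. Subtracting from the maximum ID is
  // one way to get that; the real system's exact orientation is an assumption here.
  return MAX_DOC_ID - packed;
}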
In practice, this is extremely rare, and we decided that this downside was acceptable because we often ran into a similar situation in the old system when a Tweet was delayed for more than 15 seconds, which also resulted in the assignment of an unordered document ID. How it used to work: Unrolled linked lists For the past eight years, the search systems at Twitter used a prepend-only unrolled linked list as the data structure backing the posting lists. This allowed us to avoid the overhead of storing a pointer for every value and vastly improved the speed of traversing a posting list because it was cache friendly. (An unrolled linked list is a linked list with multiple values per link — there is a good description on Wikipedia.) In our old implementation, the linked list started out with a single value, and we allocated exponentially larger nodes each time the list needed to grow. Searcher threads would start at the most recently added item of the linked list and follow pointers until reaching the end of the list. Writers would only add new items to the start of the list, either by adding a new posting in the existing array or creating a new block and adding the new posting to the new block. After adding the item and setting up the links correctly, the writer would atomically update its pointer to the new head of the linked list. Searchers would either see the new pointer or the old pointer, but never an invalid pointer. The list was prepend-only so that we didn’t need to rewrite pointers internally and also because adding items to the middle of a block of postings would have required copying, slowing down the writer and requiring complex bookkeeping (like tracking which blocks were up to date, which were stale but still used by searchers, and which could be safely used for new postings). It also worked well with our old document ID assignment scheme, because it guaranteed that posting lists were always sorted by document ID. These linked lists supported a single writer and many searchers without using any locks, which was crucial for our systems: We had a searcher for every CPU core, with tens of cores per server, and locking out all of the searchers every time we needed to add a new document would have been prohibitively expensive. Prepending to a linked list was also very fast (O(1) in the size of the posting list) as it just required following a single pointer, allocating a new element, and updating the pointer. You can find more details on the unrolled linked list approach we used in Earlybird: Real-Time Search at Twitter. The new hotness: Skip lists The linked list data structure served our system well for many years: it was easy to understand and extremely efficient. Unfortunately, that scheme only works if the incoming Tweets are strictly ordered, because you can’t insert new documents into the middle of the list. To support this new requirement, we used a new data structure: skip lists. Skip lists support O(log n) lookups and insertions into a sorted set or map, and are relatively easy to adapt to support concurrency. A skip list has multiple levels, and each level stores elements in a linked list. The lowest level contains all of the elements, the next highest level contains some fraction of those elements, and so on, until we reach the highest level, which contains just a few elements. If an element is stored at level N, then it is also stored at all levels 1, 2, …, N - 1.
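Stepping back for a moment to the old structure: a minimal sketch of the prepend-only unrolled linked list described above might look like the following. It is a simplified, single-threaded illustration; the real JVM implementation relies on memory-visibility guarantees that plain TypeScript does not model, and the doubling growth policy here is an assumption.

// Sketch of a prepend-only unrolled linked list with one writer and many readers.
interface Block {
  postings: number[];   // document IDs stored in this block
  used: number;         // how many slots are filled
  next: Block | null;   // link toward older, earlier-allocated blocks
}

class UnrolledPostingList {
  // Readers always start from `head`; the writer replaces `head` only after the
  // new block is fully initialized, so readers never see a half-built block.
  private head: Block = { postings: new Array(1), used: 0, next: null };

  prepend(docId: number): void {
    const block = this.head;
    if (block.used < block.postings.length) {
      block.postings[block.used] = docId;
      block.used += 1;              // publish the new posting last
      return;
    }
    // Current block is full: allocate a larger block (doubling is an assumption)
    // and make it the new head. Older blocks are never rewritten.
    const bigger: Block = {
      postings: new Array(block.postings.length * 2),
      used: 1,
      next: block,
    };
    bigger.postings[0] = docId;
    this.head = bigger;             // atomically swap in the real system
  }

  // Iterate newest-first: the most recently added postings sit at the head block.
  *iterate(): IterableIterator<number> {
    for (let b: Block | null = this.head; b !== null; b = b.next) {
      for (let i = b.used - 1; i >= 0; i--) yield b.postings[i];
    }
  }
}

The key property the sketch tries to show is that a posting becomes visible only after it is fully written (the used counter is bumped last) and that older blocks are never rewritten, so readers that started from an older head still see a consistent list. The skip list that replaces it, continued below, gives up some of this cache friendliness in exchange for out-of-order inserts.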
Each of the elements in a higher level contains a link to the equivalent element in a lower level. To find a given element, you start out at the highest level, and read along that linked list until you find an element that is greater than the one you are looking for. At that point, you descend one level and start examining elements in the more densely populated linked list. When we add an element, we always add it to the bottom level. If we add the element at level N, we randomly decide to add the element at level N + 1 with 20% probability and continue recursively until we don't make the random decision to add an element at the next higher level. This gives us 1/5th as many elements at each higher level. We have found a 1:5 ratio to be a good tradeoff between memory usage and speed. We can implement the skip list in a single flat array where each “pointer” is just an index into the skip list array. Note that each level of the skip list is terminated with a special “end of level” value, which signals to the search process that there are no higher values at this level of the skip list. Our skip list implementation has a few notable optimizations to reduce the amount of memory used in each posting list, improve search and indexing speed, and support concurrent reading and writing. First, our skip list always adds elements to the end of an allocation pool. We use this to implement document atomicity, described below. When a new element is added, first we allocate the element at the end of the pool, then we update the pointer at the lowest level of the skip list to include that item. Typically, this is the “head” pointer of the skip list, but if it's a larger document ID than the head of the skip list, we will traverse further down the skip list and insert the posting into the correct place. We also sometimes rewrite pointers in the higher levels of the skip list structure. There is a one-in-five chance that we add the value to the second level of the skip list, a one-in-25 chance that we add it to the third level, and so on. This is how we ensure that each level has one-fifth of the density of the level below it, and how we achieve logarithmic access time. Second, when we search for an element in a skip list, we track our descent down the levels of the skip list, and save that as a search finger. Then, if we later need to find the next posting with a document ID greater than or equal to a given value (assuming the value we are seeking is higher than the original value we found), we can start our search at the pointers given by the finger. This changes our lookup time from O(log n), where n is the number of elements in the posting list, to O(log d), where d is the number of elements in the posting list between the first value and the second value. This is critical because one of the most common and expensive types of queries in a search engine is the conjunction, which relies heavily on this operation. Third, many of the data structures in our search system store all of their data in an array of primitives to remove the overhead imposed by the JVM of storing a pointer to an object, and to avoid the cost of having many live objects survive in the old generation, which could make garbage collection cycles take much more time. This makes coding data structures much more challenging, because every value in the array is untyped — it could be a pointer, a position, a payload, or completely garbled in the case of race conditions.
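The flat-array encoding and the search finger described above can be sketched as follows. This is deliberately simplified to a single dense level (a real implementation adds the sparser upper levels, which is where the logarithmic descent comes from), and all names are illustrative rather than Twitter's.

// Simplified sketch: each node is two consecutive slots in a plain number
// array, [value, nextPointer], where "pointers" are just indices into the
// same array, and a sentinel marks the end of the level.
const END_OF_LEVEL = -1;

class FlatPostingList {
  private pool: number[] = []; // append-only allocation pool
  private head = END_OF_LEVEL; // index of the node with the smallest value

  // Insert keeping values sorted. New nodes are always allocated at the end
  // of the pool, as in the scheme described above.
  insert(value: number): void {
    const node = this.pool.length;
    this.pool.push(value, END_OF_LEVEL);

    if (this.head === END_OF_LEVEL || value < this.pool[this.head]) {
      this.pool[node + 1] = this.head;
      this.head = node;
      return;
    }
    let prev = this.head;
    while (this.pool[prev + 1] !== END_OF_LEVEL && this.pool[this.pool[prev + 1]] < value) {
      prev = this.pool[prev + 1];
    }
    this.pool[node + 1] = this.pool[prev + 1];
    this.pool[prev + 1] = node;
  }

  // "Search finger" style seek: find the first value >= target, resuming from a
  // previously returned position instead of the head. As in the post, this is
  // only valid when target is at least as large as the value the finger was
  // obtained for.
  seek(target: number, fingerIndex: number = this.head): { value: number; finger: number } | null {
    let i = fingerIndex;
    while (i !== END_OF_LEVEL && this.pool[i] < target) {
      i = this.pool[i + 1];
    }
    if (i === END_OF_LEVEL) return null;
    return { value: this.pool[i], finger: i };
  }
}

Even in this stripped-down form, two of the ideas from the post are visible: new entries are always allocated at the end of the pool, and a seek can resume from a previously returned finger instead of from the head, which is what keeps conjunction-style queries cheap.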
Using programming language-level features like classes or structs in another language would make understanding and modifying these data structures far easier, so we are eagerly awaiting the results of OpenJDK's Project Valhalla. Fourth, we only allocate one pointer for every level of the skip list. In a typical skip list, a node will have a value, a pointer to the next largest element in the list and a pointer to the lower level of the skip list. This means that a new value allocated into the second level will require the space for two values and four pointers. We avoid this by always allocating skip list towers contiguously. Each level K pointer will only point to other level K pointers, so to extract the value associated with a level K pointer P, you read the value at P - K. Once you reach a node with a value greater than the one you are searching for, you go back to the previous node, and descend a level by simply subtracting one from the pointer. This lets us allocate a value into the second level by just consuming the space for the value and two pointers. It also reduces the amount of time we need to spend chasing pointers, because a pointer into a tower is likely to be on the same cache line as the lower pointers in that tower. One of the downsides of having a linked list between elements in a particular node instead of an array is that the skip list is much less cache friendly than an unrolled linked list or B-tree — every single element access requires chasing a pointer, which often results in a cache miss. This increased the amount of time it took to traverse a single posting list, especially the very dense ones. However, the tree structure and logarithmic access time actually improved the speed for queries that accessed sparse documents (conjunctions and phrase queries) and allowed us to reach nearly the exact same average query evaluation speed as the unrolled linked list. Document atomicity In any data storage and retrieval system, it’s important to ensure that operations happen atomically. In a search system, this means that either all of the terms in a document are indexed or none of them are. Otherwise, if a person who used Twitter searched for a negation, for example, "filter:images dogs AND -hot", and we were in the middle of adding the image annotation "I sure love hot dogs!" to the ”dogs” posting list but hadn’t added it to the “hot” posting list, the person might see a photo of a grilled sausage sandwich instead of a lovable canine. When using sequential document IDs and avoiding updates, it's easy to track which documents are visible to the searchers. A searcher thread reads from an atomic variable that marks the smallest visible document ID, and skips over any document identifiers that are smaller than that document ID. This let us decouple the logical level of consistency — the document — from the underlying data structures that support concurrency at a more granular level. In our new system, newer documents are no longer guaranteed to have smaller document IDs. To support atomic updates to the skip list posting lists, we take advantage of the fact that the skip lists allocate new values at the end of a given allocation pool. When a searcher is created, it atomically gets a copy of a data structure that tracks the current maximum pointer into each allocation pool, which we call the published pointer. 
Then, whenever the searcher traverses a posting list, if the address of a posting is greater than the published pointer, the searcher will skip over that posting, without returning the document as a potential match. This lets us update the underlying data structure while the searcher is running, but preserves our ability to atomically create and update documents, and achieves our goal of separating data structure level atomicity from the logical, application layer atomicity. Changes to our ingestion pipeline Because of the changes to our indexing system, we no longer needed our ingestion pipeline to have a buffer to sort all incoming Tweets. This was a big win for reducing our indexing latency, but we still had the delay caused by waiting for some fields to become available. We noticed that most Tweet data that we wanted to index was available as soon as the Tweet was created, and our clients often didn’t need the data in the delayed fields to be present in the index right away. So we decided that our ingestion pipeline shouldn’t wait for the data in these delayed fields to become available. Instead, it can send most of the Tweet data to our indexing system as soon as the Tweet is posted, and then send another update when the rest of the data becomes available. This approach allowed us to remove the final artificial delay in our ingestion pipeline, at the cost of very new Tweets not having complete data. Since most search use cases rely only on fields that are immediately available, we believe this was an acceptable price to pay. Rolling out the changes to all customers Rolling out a change of this magnitude isn’t easy. First, our ingestion pipeline became more complex, because we switched from one that was fully synchronous to one with synchronous and asynchronous parts. Second, changing the core data structure in our indexing server came with the risk of introducing obscure bugs that were virtually impossible to detect in unit tests. And third, it was hard to predict if returning newer Tweets would break any assumption in a customer's code. We decided that the best way to test our new ingestion pipeline (in addition to writing many new tests) was to deploy it alongside the old ingestion pipeline. This meant that during the rollout period we had two full copies of indexing data. This strategy allowed us to gradually migrate to the new ingestion pipeline, and at the same time, we could easily switch back to the old stream of data if there was a problem. We also set up a staging cluster of indexing servers that mirrored our production environment as closely as possible and started sending a fraction of our production requests to these servers in addition to production servers (a technique known as dark reads). This allowed us to stress test the changes to our core indexing data structures with real production traffic and load, with both the old and new streams of Tweet data. Once we were confident that the new data structures were working as expected, we reused this staging cluster to test the correctness of the data produced by the new ingestion pipeline. We set up this cluster to index the data produced by the new ingestion pipeline and compared the responses to production. Doing so revealed a few subtle bugs, which only affected dark traffic. Once we fixed them, we were ready to deploy the change to production. Since we didn’t know what assumptions our clients made in their code, we took a conservative approach.
We added code to our indexing servers to not serve any Tweet posted in the last 15 seconds (our initial indexing latency) and gradually rolled out all changes on the indexing side. Once we were confident that the indexing changes worked as expected, we removed the old ingestion pipeline. We checked with customers to see if recent Tweets would cause any issues, and, fortunately, none of them relied on the 15-second delay. However, some customers relied on the delayed fields being indexed at the same time as the original document. Instead of adding special handling for queries from these clients in the search system, those customers added a filter to their queries to not include recent Tweets. After this, we were finally ready to remove the 15-second serving restriction we had previously added to our indexing servers. Conclusions Making changes to a data storage and retrieval system introduces unique challenges, especially when those systems serve hundreds of thousands of queries per second. To get low search indexing latency at Twitter, we needed to make significant changes to our ingestion pipeline and the core data structure used in our indexes, extensively test these changes, and carefully deploy them. We believe the effort was worth it: indexing a Tweet now takes one second, which allows us to satisfy the requirements of even the most real-time features in our product. Acknowledgements This work wouldn’t have been possible without continuous advice and support from all members of the Search Infrastructure team at Twitter: Alicia Vargas-Morawetz, Andrew McCree, Bogdan Gaza, Bogdan Kanivets, Daniel Young, Justin Leniger, Patrick Stover, Petko Minkov, and Will Hickey; as well as Juan Caicedo Carvajal, Juan Luis Belmonte Mendez, Paul Burstein, Reza Lotun, Yang Wu, and Yi Zhuang. [1] For example, consider the following Tweet: "SF is my favorite city, but I also like LA." Since "SF" appears at the beginning of the Tweet and "LA" appears at the end, it might make sense to assign a higher score to this Tweet when users search for "SF" than when they search for "LA." In this case, the score would be a payload. In April/May of 2010, I spent four weeks as a freelance writer for The Onion News Network.Besides looking great on a resume, it was an awesome gig and a huge learning experience. Not surprisingly, I get asked about it pretty often.So here's everything I know about how to get a job writing for The Onion, and what I learned from my (short) time doing it.How I Got a Job Freelance Writing for The OnionGetting a job writing for The Onion is not easy. But there's no big secret to it, either.The Onion has a jobs page on their website.Yep. That's the big story behind how I got hired to write for The Onion: I went on their website and applied.But here's the thing -- I did this at least twice with no luck before I got accepted.Here's how it worked back then:I would check the job listings frequently (obsessively), and every once in a blue moon, there'd be a position on there called Freelancer or Freelance Writer. You'd send in an email with your resume and samples, and they'd send back an application for you to complete. You had to return two things:A list of 20 headlines and concepts for video sketches. (All the job listings I ever saw were for The Onion News Network -- their video leg.
I never saw anything for editorial positions.)A completed script for one of their news segment sketches based on ideas or concepts they provided.Like I said, I went through this process (the final product is often known in the industry as a "packet") at least twice with no success. It doesn't sound like much, but it is TOUGH. I labored over those 20 headlines like you wouldn't believe. Even the best ideas start to look like shit after you've stared at them for a week.And the scripts. Man, those were hard. They never gave you blockbuster concepts to work with, and I think that was on purpose. It was always some kind of dry, subtle joke -- they wanted to see what you could do and how far you would take it.But the third time I applied was a charm. I'll never forget opening my email and seeing these words:The Onion News Network would like to congratulate you on being selected as a Contributor for the IFC series. We had a huge number of applicants, and we are thrilled with how strong this group is.I remember texting my wife -- girlfriend at the time. She was in class and stepped into the hall to call me. We both basically screamed on the phone for a couple of minutes.Side Note: I haven't seen any writing positions listed on The Onion's website in years. Granted, I don't check as frequently anymore, but I think the only way to get in with them these days is to be recruited, know someone there, or come up through the ranks as an intern or writer's assistant.If you REALLY want to get a job writing for The Onion; email them. I've heard stories of people that have done this. They say they won't take anything unsolicited, and there's a 99% chance they won't, but who knows? Serendipity just might strike for you.What It Was Like to Write for The OnionThey put us to work right away. This was back in 2010, and The Onion was going into pre-production on its very first TV show -- The Onion News Network on IFC (The Independent Film Channel).It would be a lot like the Onion video content we knew and loved, except in a half hour format and with recurring anchors as characters.We were required to turn in 25 ideas a week; an "idea" being a headline or concept along with a short explanation of the joke and how the segment would play out.Some weeks, we'd get really brief feedback on our lists. Other weeks, not. Then, by mid week, they'd compile all the ideas they liked. Some would be marked down as "one liners", or jokes that would be pulled in to scroll across the bottom of the screen during the show. Others were segmented by where in the show they might fit in, with some of them designated to move to the scripting stage (where the staff writers would take over).The process moved fast. My job, along with the other freelancers, was to come up with ideas. We weren't really consulted or informed about anything else surrounding the show. No meetings to discuss casting. No phone calls to brainstorm around our jokes. It was our job to pump out creativity and let the full time staff filter everything out. Everything was done over email.In the end, only one of my ideas actually made it into the show -- unless I missed a one-liner somewhere along the line, which would be easy to do.Funny enough, the idea of mine that made it to production was one from my submission packet -- I'm pretty sure it's the reason they hired me. And they actually chose to open the entire series with it, which was pretty cool.I recorded the pilot when it aired and probably watched it a dozen times with my family. 
One of the coolest moments I've ever had as a writer.What I Learned Writing for The OnionThis was a huge learning experience for me in a very short amount of time. I could probably talk for days about everything I learned working as a writer for The Onion for four weeks, but if I had to boil it down, here's what I'd say:Ideas are cheap.Or, in their case, jokes. They made us come up with so many sketch ideas and headlines. Huge lists of them. And once they were dead, they were dead. There was no reusing ideas that didn't quite make the cut in previous weeks. No iterating on almost-there jokes. When an idea was rejected, it was time to move on. Always bigger and better jokes, or ideas, out there to be found.I also learned how to really dig into an idea to see if it had any potential. They were constantly pushing us to see past the headline. Is it just a funny sentence or is there room to explore and expand this into a 3 minute sketch? Is there substance?Creativity is a muscle.What I loved about The Onion's writing process was the sheer volume of it. 25 ideas a week. No excuses. Often times, I'd turn in more than 25, along with a big list of one-liners I knew weren't big enough for full sketches. For one thing, it forced me to dig deep and get past those first initial ideas that came easily. It also helped me learn The Onion's editorial voice through repetition, which is something that can't be overstated.I am not that funny.I had one great idea and I used it up in my application to get the job. I wrote a few other funny jokes throughout, but nowhere near the level of The Onion's top writers. I wish I could show you these lists of approved ideas I got in my inbox each week. Reading through them was an absolute blast; these guys and girls are so freaking talented and funny.At the end of the project, they opted to keep a few of the freelance writers on -- alas, I was not one of them. And I didn't expect to be. To be a professional humor writer, you have to be insanely funny and creative. I'm a good writer, and I got their voice, and I have a few good ideas, but I was never going to make it as a full-time writer for The Onion. And that's okay.In The EndWriting for The Onion was one of the coolest experiences I've ever had, and I feel lucky to have gotten the opportunity.I mean, I got to see an idea I had sitting in my living room acted out on TV with a real script, real actors, and real production value. That was awesome, and I'll never forget it. And to top it off, I actually got a paycheck. People paid me real money to come up with jokes. That was incredible.And I'd like to think that, even though the gig was short lived, I learned a ton by being bold enough to jump into the deep end with pro comedy writers.Even if I didn't really belong there. Microsoft is about to (mostly) give up on physical retail. Today the company announced plans to permanently close all Microsoft Store locations in the United States and around the world, except for four locations that will remain open and be “reimagined.” Those locations are New York City (Fifth Ave), London (Oxford Circus), Sydney (Westfield Sydney), and the Redmond campus location. The London store only just opened about a year ago. All other Microsoft Store locations across the United States and globally will be closing, and the company will concentrate on digital retail moving forward. 
Microsoft says Microsoft.com and the Xbox and Windows storefronts reach “up to 1.2 billion monthly customers in 190 markets.” The company has not specified whether layoffs will come as a result of the widespread store closures, only saying that “our commitment to growing and developing careers from this diverse talent pool is stronger than ever.” The decision partially explains why Microsoft had yet to reopen a single store after they were all closed in light of the COVID-19 pandemic. Last week, Microsoft told The Verge that its “approach for re-opening Microsoft Store locations is measured and cautious, guided by monitoring global data, listening to public health and safety experts, and tracking local government restrictions.” The company declined to offer an update on when any stores might open again. Since many Microsoft stores are in shopping centers and malls, the continued closure hasn’t stood out as unusual. In US states that are taking a cautious approach to restoring retail operations — to avoid a resurgence of the novel coronavirus — most malls remain closed. There have already been spikes of COVID-19 cases in regions with more relaxed guidelines, which has led Apple to re-close some stores where it had only recently welcomed customers back in. In April, Microsoft outlined in a blog post how many retail store associates had shifted to remote work after their everyday jobs were sidelined. The company has continued to provide regular pay for team members through the pandemic. “Our retail team members will continue to serve customers working from Microsoft corporate facilities or remotely and we will continue to develop our diverse team in support of the overall company mission and objectives,” the company said in today’s update. The Microsoft Store debuted in 2009 and closely adhered to Apple’s successful retail playbook. Each store is a showcase for the company’s Surface and Xbox hardware, plus a selection of third-party PCs. Employees were well-versed in all things Windows, and the company also offered in-store events, workshops, customer service, and repairs. www.prisma.io Comments At Prisma, our goal is to revolutionize how application developers work with databases. Considering the vast number of different databases and variety of tools for working with them, this is an extremely ambitious goal! We are thrilled to enter the next chapter of pursuing this goal with a $12M Series A funding round led by Amplify Partners. We are especially excited about this partnership as Amplify is an experienced investor in the developer tooling ecosystem and has led investments for numerous companies, such as Datadog, Fastly, and Gremlin. Our mission: Making databases easy Database tools are stuck with legacy paradigms Despite having been developed in the 1970s, relational databases are still the most commonly used databases today. While other database types have been developed in the meantime, from document, to graph, to key-value databases, working with databases remains one of the biggest challenges in application development. While almost any other part of the development stack has been modernized, database tools have been stuck with the same paradigms for the last few decades. When working with relational databases, developers have the choice of working directly with SQL or using higher-level abstractions called ORMs. Neither of these options is particularly compelling. Using SQL is very low-level, resulting in reduced developer productivity.
In contrast, ORMs are too high-level and developers sacrifice control over the executed database operations when using this approach. ORMs further suffer from a fundamentally misguided abstraction called the object-relational impedance mismatch. Prisma modernizes how developers work with databases Similar to how React.js modernized frontend development or how Serverless invented a new model for compute infrastructure, Prisma is here to bring a new and modern approach for working with databases! Prisma's unique approach to solving database access with a generated query builder that's fully type-safe and can be tailored to any database schema sets it apart from previous attempts at solving the same problem. A big part of the modernization comes from our major focus on developer experience. Database tools are often associated with friction, uncertainty, painful hours of debugging and costly performance bottlenecks. Developer experience is part of our DNA at Prisma. Working with databases is too often associated with friction and uncertainty when it should be fun, delightful and productive! We want to make working with databases fun, delightful and productive while guiding developers towards proper patterns and best practices in their daily work with databases! Learning from our past: From GraphQL to databases As a company we've gone through a number of major product iterations and pivots over the last few years. Our initial products, Graphcool and Prisma 1, were focused on GraphQL as a technology. However, as we were running both tools in production, we realized they didn't address the core problems developers had. We realized that a lot of the value we provided with both tools didn't necessarily lie in the quick provision of a GraphQL server, but rather in the fact that developers didn't need to manage their database workflows explicitly. This realization led to a pivot which ultimately manifested in the rewrite to Prisma 2. With this new version of Prisma, we have found the right level of abstraction that ensures developers keep full control and flexibility over their development stack while not needing to worry about database workflows! Inspired by the data layers of big companies (Twitter, Facebook, ...) The approach Prisma takes for this modernization is inspired by big tech companies such as Twitter, Facebook, or Airbnb. To ensure productivity of application developers, it is a common practice in these organizations to introduce a unified data access layer that abstracts away the database infrastructure and provides developers with a more familiar and convenient way of accessing data. Facebook developed a system called TAO that fulfills the data needs of application developers. Twitter has built a "virtual database" called Strato which brings together multiple data sources so that they can be queried and mutated uniformly.
Airbnb combines GraphQL and Thrift to abstract away the implementation details of querying data. Building these custom data access layers requires a lot of time and resources (as these are typically implemented by dedicated infrastructure teams) and thus is not a realistic approach for most companies and development teams. Being based on the same core ideas and principles as these systems, Prisma democratizes the pattern of a uniform data access layer and makes it accessible as an open-source technology for development teams of all sizes. Where we are today Prisma 2.0 is ready for production After running Preview and Beta versions for more than a year, we've recently launched Prisma 2.0 for production. Having rewritten the core of Prisma from Scala to Rust for the transition, we've built a strong foundation to expand the Prisma toolkit to cover various database workflows in the future. Prisma's main feature is Prisma Client, an auto-generated and type-safe query builder which can be used to access a database in Node.js and TypeScript. Thanks to introspection, Prisma Client can be used to work with any existing database! Note: Prisma currently supports PostgreSQL, MySQL and SQLite databases – with more planned. Please create new GitHub issues or subscribe to existing ones (e.g. for MongoDB or DynamoDB) if you'd like to see support for specific databases. Next-generation web frameworks are built on Prisma The Node.js ecosystem is known for lots of different frameworks that try to streamline workflows and prescribe certain conventions. We are extremely humbled that many framework authors decide to use Prisma as their data layer of choice. Redwood.js: Bringing full-stack to the Jamstack The new RedwoodJS framework by GitHub co-founder Tom Preston-Werner seeks to become the "Ruby on Rails" equivalent for Node.js. RedwoodJS is based on React and GraphQL and comes with a baked-in deployment model for serverless functions. Blitz.js: The Fullstack React Framework Another framework generating increasing anticipation and excitement in the community is Blitz.js. Blitz is built on top of Next.js and takes a fundamentally different approach compared to Redwood. Its goal is to completely eliminate the API server and "bring back the simplicity of server rendered frameworks". Nexus: A delightful GraphQL application framework At Prisma, we're huge fans of GraphQL and believe in its bright future. That's why we founded the Prisma Labs team, which dedicates its time to work on open source tools in the GraphQL ecosystem. It is currently focused on building Nexus, a delightful application framework for developing GraphQL servers. As opposed to RedwoodJS, Nexus is a backend-only GraphQL framework and has no opinions on how you access the GraphQL API from the frontend. What's next for Prisma Database migrations with Prisma Migrate Database migrations are a common pain point for many developers! Especially with applications running in production, it is often unclear what the best approach is to perform schema changes (e.g. in CI/CD environments). Many developers resort to manual migrations or custom scripts, making the process brittle and error-prone. Prisma Migrate is our solution to this problem. Prisma Migrate lets developers map the declarative Prisma schema to their database.
Under the hood, Prisma Migrate generates the required SQL statements to perform the migration. Note: Prisma Migrate is currently in an experimental state and should not be used in production environments. Prisma Studio: A visual editor for your database workflows Prisma Studio is your visual companion for various database workflows. It provides a modern GUI that lets you view and edit the data in your database. You can switch between the table and the tree view; the latter is especially convenient for drilling deeply into nested data (try out the online demo to explore both views). Note: Prisma Studio is currently in an experimental state and should not be used in production environments. Beyond Node.js and TypeScript: Prisma Client in other languages Prisma Client is a thin, language-specific layer that delegates the heavy lifting of query planning and execution to Prisma's query engine. The query engine is written in Rust and runs as a standalone process alongside your main application. This architecture enables us to expand Prisma Client to other languages and bring its benefits to developers beyond the Node.js community. We are already working on Prisma Client in Go with a first alpha version ready to try out! Supporting a broad spectrum of databases and other data sources Prisma is designed in such a way that it can potentially connect to any existing data source as long as there is the right connector for it! As of today, we've built connectors for PostgreSQL, MySQL and SQLite. A connector for MongoDB is already in the works and more are planned for the future. Building commercial services to sustain the OSS tools We are committed to building world-class open-source tools to solve common database problems of application developers. To be able to sustain our open-source work, we're planning to build commercial services that will enable development teams and organizations to collaborate better in projects that are using Prisma. Note: The plans for commercial services do not affect the open-source tools we are building; those will remain free forever.
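To make the earlier description of Prisma Client concrete, here is a small usage sketch in TypeScript. The User and Post models (and their fields) are hypothetical and would come from the project's own Prisma schema; the general shape of the calls (create, findMany, nested writes, relation filters) reflects the Prisma 2.0 query builder, but treat the details as illustrative rather than authoritative.

// Illustration of using the generated Prisma Client (Prisma 2.0) from TypeScript.
// The User and Post models are assumed to be defined in the project's Prisma
// schema; `prisma generate` produces a client typed against exactly those models.
import { PrismaClient } from '@prisma/client';

const prisma = new PrismaClient();

async function main() {
  // Create a user together with a related post in one nested write.
  const user = await prisma.user.create({
    data: {
      email: 'ada@example.com',
      name: 'Ada',
      posts: { create: { title: 'Hello Prisma' } },
    },
  });

  // Type-safe query: the filter fields and the shape of `publishedPosts`
  // are checked by the TypeScript compiler against the generated types.
  const publishedPosts = await prisma.post.findMany({
    where: { published: true, author: { email: { contains: '@example.com' } } },
    include: { author: true },
  });

  console.log(user, publishedPosts);
}

main().finally(() => prisma.$disconnect());

Because the client is generated from the schema, a typo in a field name or a filter on a non-existent relation fails at compile time rather than at runtime, which is the type-safety benefit the post emphasizes.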
